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These lecture notes have been prepared for a course in Trieste, part of the Ph.D. program in 
Statistical Mechanics. They focus on the mean field theory of spin glasses, with particular emphasis 
on the presence of a very large number of metastable states in these systems. This phenomenon, 
and some of its physical consequences, will be discussed in details for fully-connected models and 
for models defined on random lattices. This will be done using the replica and cavity methods. 
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I. INTRODUCTION 



A. Why study spin glasses? 

Spin glasses have been intensively studied since the seventies [1, 2]. The original motivation for their study was 
the description of a class of magnetic alloys, but since the beginning it was realized that they are representative 
of a much more general class of disordered systems. The concept of "disordered system" is obviously very generic 
and disordered systems display many different and interesting phenomena whose description might the subject of a 
collection of monographs. 

Here we will limit ourselves to discuss the main physical properties of spin glasses and list some different systems 
that share the same properties. Spin glasses are very interesting for many reasons [3]: 

• Spin glasses are the simplest example of glassy systems. Glassy systems are a class of disordered systems that 
share a common phenomenology, characterized by a very slow dynamics in the low temperature phase. In spin 
glasses, there is a highly non-trivial mean field approximation where one can study phenomena that have been 
discovered for the first time in this context, the most striking one being the existence of many equilibrium states 1 . 

• The study of spin glasses opens a very important window for studying off-equilibrium behavior of glassy systems, 
i.e. their dynamic evolution when abruptly cooled into the low-temperature phase starting form high temper- 
ature. In this framework it is possible to derive some of the main properties of a generic glassy systems, e.g. 
history dependent response [8-10]; this property, in the context of mean field approximation, is in fact related 
to the existence of many equilibrium states. Aging and the related violations of the equilibrium fluctuation 
dissipation relations emerge in a natural way and they can be studied in this simple setting [11-13]. 

• The theoretical concepts and the tools developed in the study of spin glasses are based on two logically equiv- 
alent, but very different, methods: the algebraic broken replica symmetry method and the probabilistic cavity 
approach [9]. They have a wide domain of applications. Some of the properties that appear in the mean field 
approximation, like ultrametricity, are unexpected and counterintuitive. 

• Spin glasses also provide a testing ground for a more mathematically inclined probabilistic approach: the rigorous 
proof of the correctness of the solution of the mean field model came out after twenty years of efforts where 
new ideas, e.g. new variational principles [14], were at the basis of a recent rigorous proof (see [15] and [16] for 
a concise explanation of the main ideas) of the correctness of the mean field approximation in the case of the 
infinite range Sherrington-Kirkpatrick and p-spin models that we will introduce below. 



B. Physical systems 

There are many physical systems that have been described using methods and ideas borrowed from the spin glass 
physics. Below we list some of them in order to give an idea of the wide variety of physical situations. Details can be 
found in the references. 

• Real spin glasses: these are typically metallic materials hosting magnetic impurities located at random positions. 
The spin polarization of the electrons around the magnetic impurity is oscillating at large distance, 

cos2fc F r 

Sind(r) ^ ; (1) 

these are called Friedel oscillations and are related to the existence of the Fermi surface (see chapter 2 of [10] for 
more details). Therefore, the coupling between two spins has random sign and intensity, because the distance 
between two spins is a random variable. The simplest idealization of the interaction between two spins S, and 
Sj is a coupling term —JijSi ■ Sj in the Hamiltonian, and is taken to be a random variable. In these materials 
the disorder (i.e. the values of the Jij) is due to the doping with impurities, so in some sense it is put there "by 
hand" when preparing the sample (it is called quenched disorder in the literature). 



1 This sentence is too vague: one should discuss its precise mathematical meaning; although we will present later a physically reasonable 
definition, for a careful discussion see Refs. [4-7]. 
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• Glass-forming liquids [17-19]: many liquids, when cooled fast enough, freeze in an amorphous state or glass. 
One may think for example to a system of hard spheres, or of point-like particles interacting via a Lennard- Jones 
potential. In the glass, the density profile is not uniform and the particles prefer to stay close to a set of sites 
that is not periodic like a crystal. In this case the local environment of each particle is different. However 
the disorder here is not present in the original Hamiltonian because the interactions between the particles are 
deterministic. We may say that the disorder is self- generated by the system. In addition to hard-spheres and 
Lennard-Jones systems, many more complicated liquids like molecular liquids and polymeric liquids display a 
glass transition. 

• Colloidal dispersions are typically made by mesoscopic particles dispersed in water or other solvents. The 
interaction between the particles can be tuned by adding different components to the solutions, and a wide 
variety of potentials, ranging from purely hard-core to long range interactions have been created. These systems 
usually display, at high enough concentration of particles, a dynamical arrest very similar to the one observed in 
glass forming liquids, Some ideas from glass physics have been applied to the study of these systems, mainly of 
their dynamics [20]. Note that, due to the wide variety of potentials that can be engineered, in some cases the 
underlying microscopic phenomenon might be very different, leading to different arrested phases such as gels, 
stripe phases, etc. 

• Quantum glasses: there are many quantum systems that exhibit a glassy behavior. Obviously, one can consider 
spin glass materials in which quantum effects arc important. But in addition, people have found an electron 
(or Coulomb) glass (see e.g. [21] and references therein) by considering a system of electrons close to the 
metal-insulator transition. The observation of an electronic glass phase has also been reported in high-T c 
superconducting cuprates [22]. A particular glass phase (called valence bond glass) is present in a model of 
hopping electrons on a frustrated lattice [23] . Glass phases are expected also in systems of interacting bosons 
like cold atoms, see e.g. [24] and references therein. Recently, the observation of a superfluid-like response in 
disordered solid He 4 has motivated the study of superfluid glassy (superglass) phases [25]. 

• Random lasers: one can consider a cavity filled with a disordered system, e.g. a solution of mesoscopic particles 
in a liquid or glassy matrix with different refractive index. The particles act as a disordered set of scatterers for 
light; therefore the modes of the electromagnetic field into the cavity are disordered. If an amplifying molecule 
is present in the solution, coherent amplification of the modes can be realized and a lasing phenomenon is 
observed; many modes can be excited at the same time. The dynamics of the phases of the lasing modes can be 
described by an equation that closely resemble that of a spin glass model; this led to the prediction of a glassy 
phase for the system, manifested in a locking of the phases of the modes to random position. This phenomenon 
has been called random mode locking and is the disordered version of the standard mode-locking phenomenon 
that is observed in multimode laser cavities [26]. 

• Granular materials: these are ensemble of macroscopic particles, that, unlike in colloidal systems, are not 
dispersed in a solvent. Their mass is so large that the potential energy due to gravity (~ mgd, with d the 
diameter of one particle) is much bigger than fc^T. Therefore in these systems thermal fluctuations are irrelevant, 
and their physics is dominated by gravity and frictional forces between the grains. A typical example is rice in a 
silos. In addition, the system might be agitated by an external force, like for nuts transported by a truck. The 
dynamics of these systems under external forces is very important for industrial applications. The configurations 
of the grains are typically amorphous, and some concepts borrowed from the physics of glasses have been applied 
to describe them, see e.g. [27]. 

• Biological systems: spin glass models have been used for a long time to describe several different biological 
systems. Probably the most successful example is that of neural networks [28]; in addition, other phenomena 
such as the folding of proteins have been studied using these methods [29-31]. Another important application 
of spin glass models is the inference of correlations hidden in biological data [32-36] . This field is now growing 
very quickly and there are many other applications that can be listed here. 

C. Optimization problems 

In addition to the above long list of physical systems, spin glass techniques have been applied to a large class of 
computer science problems, called "optimization problems": this connection dates back from twenty years at least 
[9, 37]. In an optimization problem, one looks for a configuration of parameters minimizing some cost function (the 
length of a tour in the traveling salesman problem (TSP), the number of violated constraints in constrained satisfaction 
problems, etc.) [38]. 
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As an example, consider a linear system of Boolean equations [37]: it is given a set of N Boolean variables Xi with 
indices i = 1, . . . , N. Any variable shall be False (F) or True (T). The sum of two variables, denoted by +, corresponds 
to the logical exclusive OR between these variables defined through, 

F+T = T+F=T , 

F + F = T + T = F. (2) 

In the following we shall use an alternative representation of the above sum rule. Variables will be equal to or 1, 
instead of F or T respectively. Then the + operation coincides with the addition between integer numbers modulo 
two. 

A linear equation involving three variables is for example x\ + X2 + £3 = 1. Four among the 2 3 = 8 assignments 
of (xi, x 2 , X3) satisfy the equation: (1,0,0), (0,1,0), (0,0,1) and (1,1,1). A Boolean system of equations is a set 
of Boolean equations that have to be satisfied together. For instance, the following Boolean system involving four 
variables 

x\ + x 2 + x 3 = 1 

x 2 + Xi = (3) 
X\ + X4 = 1 

has two solutions: (xi,x 2 , X3, X4) = (1, 0, 0, 0) and (0, 1, 0, 1). A system with one or more solutions is called satisfiablc. 
Determining whether a Boolean system admits an assignment of the Boolean variables satisfying all the equations 
constitutes the XORSAT (exclusive OR Satisfaction) decision problem. 

In spin language the problem can be reformulated as follows. We associate to each variable Xi — 0, 1 a spin 
Si = (— l) Xi . The equation x\ + x 2 = a, where a = 0, 1, can be rewritten as JS\S 2 = 1, where J = (— l) a , and 
similarly for equations involving more variables 2 . The system (3) can be rewritten as 

-SiS 2 S 3 + S 2 Si - = 3 . (4) 

Consider for simplicity a system of M equations, each involving two variables; it is equivalent to 

H = ~ 2 J i3 S i s i = - M > ( 5 ) 

where the sum is over all the pairs that appear in one of the M equations 3 . The XORSAT decision problem is 
therefore equivalent to the following question: is the ground state energy of H equal to — M or not? 

In the decision problem one is asked to determine whether a given set of constraints can be satisfied or not. One 
can also consider the optimization version of the XORSAT problem, that consists in finding the ground state of H , 
or in other words of finding the maximum possible number of equations that can be simultaneously satisfied. The 
connection between (zero temperature) statistical mechanics and optimization should be clear from this example, and 
as we will see the Hamiltonian (5) is a typical spin glass Hamiltonian. 

In spin glasses the Jij are random variables. For instance, in the example (5), we can decide that each variable 
appears in exactly in z equations; the total number of equation is then M = Nz/2. Consider a graph such that each 
variable is a vertex and a link correspond to an equation involving Si and Sj. Then with this choice the model 
is defined on a graph such that each vertex has exactly z neighbors. We give equal probability to all graphs satisfying 
this constraint. For each equation (link) the corresponding coupling is taken as a random variable. 

In computer science, random distribution of instances, such as the one we introduced above, have been used as a 
benchmark to test the behavior of search algorithms, i.e. algorithms that try to find a ground state of H (a solution 
of the problem). In physical language, a search algorithm correspond in some cases to a dynamical rule that, starting 
from a configuration of the spins, attempts to explore the configuration space while looking for the ground state. A 
typical example is Monte Carlo dynamics at very low temperature. The presence of a low-temperature "glassy" phase 
in the model, associated to slow dynamics, is clearly important for the performances of these algorithms. This is an 
important motivation to study the spin glass phases of such Hamiltonians. 

Note that, despite the beautiful studies of the average properties of the TSP, Graph partitioning, Matching, etc., 
based on spin glass methods [9], a methodological gap between the field of statistical physics and that of computer 



2 An equation of length K, xi + • • ■ + = a > is equivalent to JSi ■ ■ ■ Sk = 1- 

3 For a generic system of M equations labeled by a = 1, ■ - ■ , M one has H = — Eali JaSta ■•■S i a = M, where K a is the length of 
equation a and if , • • • , »Jf is the set of variables belonging to equation a. 
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science is far from being bridged. In statistical physics statements are usually made on the properties of samples that 
are typical with respect to some disorder distribution (i.e. distribution of the Jij). In optimization, however, one 
is interested in solving one (or several) particular instances of a problem, and needs efficient ways to do so, that is, 
requiring a computational effort growing not too quickly with the number of data defining the instance. Knowing 
precisely the typical properties for a given distribution of instances might not help much to solve practical cases. 
Unfortunately, statistical mechanics is for the moment unable to tell us precise properties for a given sample, i.e. for 
a given realization of the couplings. 

Finally, note that the recent developments in quantum computing triggered some efforts to study the performances 
of quantum algorithms to solve these problems (see e.g. [39] for a review). Therefore, recently quantum versions 
of the problems (in which Ising spins are replaced by Pauli matrices and a transverse field is added) have been 
considered [40]. Understanding the properties of quantum spin glasses might be important also in this respect. 



D. Models and universality classes 

The simplest spin glass Hamiltonian is of the form: 

1,N 

H = *^2JikSiSk , (6) 

where the J's are quenched (i.e. time independent) random variables located on the links connecting two points of 
the lattice and the S"s are Ising variables (i.e. they are equal to ±1). The total number of points is denoted with N 
and it goes to infinity in the thermodynamic limit. We will always assume that J a — obviously, and Jy — Jji. 
We can consider four models, whose solution is increasingly difficult to obtain [3]: 

• The Sherrington-Kirkpatrick (SK, or fully connected) model [9, 41]: All J's are random and different from zero, 
with a Gaussian or a bimodal distribution with variance A -1 / 2 . The coordination number is z — N — 1 and it 
goes to infinity with N. In this case a mean field theory is valid in the infinite N limit [9]. 

• The Bethe lattice model [42-44]: The spins live on a random lattice such that each variable has z neighbors, 
therefore only Nz/2 J's are different from zero: they have finite variance, it is convenient to choose z -1 / 2 in 
order to have a good limit z —¥ oo. In this case a modified mean field theory is valid. Note that this model 
correspond exactly to the XORSAT problem (5) if the distribution of the J is bimodal (up to a rescaling of H). 

• The large range model [45]: The spins belong to a finite dimensional lattice of dimension D. Only nearest spins 
at a distance less than R interact and the variance of the J's is proportional to 1/R D / 2 . If R is large, the 
corrections to mean field theory are small for thermodynamic quantities, although they may change the large 
distance behavior of the correlations functions and the nature of the phase transition, which may disappear. 

• The Edwards- Anderson (finite dimensional) model [1, 2]: The spins belong to a finite dimensional lattice of 
dimension D: only nearest neighbor interactions are different from zero and their variance is D- 1 ' 2 . In this 
case finite corrections to mean field theory are present, that are certainly very large in one or two dimensions, 
where no transition is expected. The Edwards- Anderson model corresponds to the limit R = 1 of the large 
range Edwards- Anderson model; both models are expected to belong to the same universality class. The large 
range Edwards-Anderson model provides a systematic way to interpolate between the mean field results and 
the short range model. 

As far as the free energy is concerned, one can prove the following rigorous results: 

lim Bethe(z) = SK , 

z— 7-OC 

lim Large range(i?) = SK , (7) 
Jim Edwards- Anderson(D) = SK , 

The Sherrington-Kirkpatrick model is thus a good starting point for studying also the finite dimensional case with 
short range interaction, that is the most realistic and difficult case. This starting point becomes worse and worse 
when the dimension decreases; for instance, it is not any more useful in the limiting case where D = 1. 

In the following we will mostly focus on the mean field theory of spin glasses, which gives the correct solution of 
the fully connected (SK) and Bethe lattice models. This theory is very complex and is already the subject of several 
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books and review papers [8, 9, 46]. Giving a complete account of the mean field theory of glasses is already a task 
that goes beyond the aim of these notes. 

One of the main results of the theory is the existence of two distinct classes of models displaying a very different 
phenomenology: 

1. The models defined above (except possibly the finite dimensional version) belong to a class of models called 
"full replica symmetry breaking" (fRSB). At high temperature they are paramagnetic just like their ordered 
counterparts. On lowering the temperature, they undergo a transition to a spin glass phase; for these models, 
it is a second-order phase transition, associated to a diverging correlation length and to power-law singularities 
controlled by critical exponents. In mean field theory, the low temperature phase is very complex. The equilib- 
rium states are organized in a very complicated and hierarchical way, so that the order parameter of the mean 
field theory is a function. The mean field theory of fRSB models is reviewed in [8, 9]. 

2. A class of simpler models exist, where the (many) equilibrium states are organized in a much simpler way: 
different states are simply uncorrelated, in a sense that we will precise in the following. These models are called 
"one-step replica symmetry breaking" models (1RSB). The transition to the spin glass phase, in these models, 
is quite different from the fRSB case: although it is still second order from the thermodynamic point of view, 
the order parameter jumps at the transition, making it first order in some sense. In this case, the identification 
of a diverging correlation length and associated critical exponents is not evident. The simplest representative 
of this class of models is the spin glass Hamiltonian 

1,N 

H[S] = ^ JijkSiSjSk , (8) 

which is called 3-spin glass. Again, we assume that is zero when two or more indexes are equal, and that 
they are symmetric under permutations of the indexes. More generally one can consider p > 2 spin interactions, 
hence the name p-spin glass. As in the previous case, one can consider the fully connected, Bethe lattice, large 
range and finite dimensional versions of this model, and the relations (7) hold also in this case for the free 
energy. The mean field theory of 1RSB models is reviewed in [46]. 

The analysis of 1RSB models is much simpler than fRSB ones, and fortunately their phenomenology is already 
very interesting and rich; moreover, many interesting systems like fragile glasses and many optimization problems 
are conjectured to belong to this class. Therefore, in the following we will start our analysis by the study of p-spin 
glasses, and then describe (shortly) the solution of the more complicated SK model. 



E. Frustration and quenched disorder 



There are two common ingredients in all the models and physical systems we discussed above: disorder and 
frustration. The disorder, in some cases, is built in the Hamiltonian (the coupling Jjj are random); it represents for 
instance the random position of the impurities. Clearly, the impurities might diffuse in the sample, therefore the 
Jij should be considered as dynamical variables. However, the time scale of this evolution is much larger than any 
interesting time scale, and we can consider the Jij as constant or quenched. 

When computing the partition function, we must then keep the J's fixed: 



From the partition function we can compute observables such as the energy, entropy, free energy, magnetization, etc., 
and these should then be averaged over the distribution of the J's. Therefore the average free energy must be defined 
by a so-called quenched average over the disorder: 



f = fj = -T^logZj. (10) 
In this way, the usual thermodynamic identities are satisfied: for instance, the average entropy is 



df dfj _ 

S = -dT = -lT =SJ - (U) 
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Ideally, we would like to know the properties of the system for each given realization of the J's, that corresponds to 
a given physical sample. Fortunately, one can show that intensive quantities such as /, s, etc., are self-averaging. 
This means that in the large volume limit, they converge with probability one (with respect to the distribution of the 
J's) to the average defined above. Therefore, as far as such observables are concerned, the average over the disorder 
is representative of the behavior of the typical sample. Still, as discussed above, in some applications (mainly in 
computer science) one would like to know properties of rare samples corresponding to particular choices of the J's, 
or to have bounds that hold for any choice of the J's. These problems cannot be tackled using the methods that we 
will describe in the following. 

In other cases, the disorder is self-generated by the system, as for glass-forming liquids. In these cases, clearly, the 
explicit average over the disorder is not needed. 

Another crucial ingredient is frustration. In the examples above, this is due to the fact that the J's have random 
signs. Therefore, a given spin is subject to fields due to their neighbors that have different signs. Some wants it to 
point up and others to point down. For this reason finding the ground state is not trivial and as we will see many 
degenerate ground states may be present. Note that this would not happen if all the J's were negative: in this case 
the ground state, even in presence of disorder, would be simply the configuration where all spins are equal. 

F. What is missing in these notes 

It will be impossible to cover all the relevant issues about the complex physics of spin glasses. In the following, we 
will try to review some aspects of this problem, by alternating general discussions with some more technical sections 
where methods and techniques of general importance will be introduced. 

We will focus more on equilibrium properties of mean field spin glasses, and a detailed investigation of the dynamics 
of spin glasses will not be done. Still, dynamics is very important and is probably the most relevant aspect for making 
contact with experiments. Excellent review exists [13, 46] and might be consulted to go deeper in this important 
subject. The notes are divided in two parts: the first is devoted to fully-connected models and the replica method, 
while the second to Bethe lattice models and the cavity method. We will not discuss finite dimensional models since 
the extension of these concepts to finite dimensional models is still debated; reviews and further references can be 
found in [18, 19] for 1RSB models and [6] for fRSB ones. Some exercises are proposed at the end of each section. 
Ideally they should be done while reading the notes; the appropriate moments are marked in the text by => Ex. N.n. 

II. FULLY CONNECTED MODELS 

We said in the introduction that frustration causes the existence of many thermodynamic states and that this is 
the main interesting property of glassy systems and in particular of spin glasses. The aim of this section is to make 
this statement more precise, by looking to the exact solution of fully connected models: the SK model belonging to 
the fRSB class and the spherical p-spin model belonging to the 1RSB class. Our aim here is to understand the nature 
of the transition and of the low temperature phase; to identify the symmetry that is broken (if any) and a correct 
order parameter; and to discuss what are the relevant susceptibilities that diverge signaling the transition. 

A. Free energy functional 

1. The fully connected Ising ferromagnet 

Before turning to the more complicated case of spin glasses, we will here review very briefly the concept of metastable 
state for the familiar Ising ferromagnet. We will limit ourselves to the fully connected case where the definition is 
much simpler; for a general discussion see [47]. 

Let us then consider the fully connected Ising model, whose Hamiltonian is given by 

l,iV JV 

H ^ = -^E S ^' ~ B T, S < = -—m[Sf-Bm[S] , (12) 

i,j i=l 

where we defined m[S] = S%/N the magnetization per spin. The Hamiltonian H[S], and consequently the Gibbs 

probability P[S] cx exp — [3H{S], depend only on the magnetization m[S]. Therefore, the total probability that the 
system has magnetization m[S] — m can be written as the product of the probability of a given configuration with 
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m[S] = to times the number of such configurations; the latter is a combinatorial factor counting the number of ways 
one can choose N + = N(l + m)/2 spins (out of N) to be equal to 1. We obtain 

P(m) OC e N l0Jm 2 /2+0Bm] ( N \ _ e N[l3Jm 2 /2+fiB m +S°(m)] = e -l3Nf{m) ^ ^ 



S°{m) = lim — log , , /0 = —log — — log — — . (14) 



where 

S°(rn\ = ,,,, , , 

n^oo N & \N(l + m)/2 

The function /(to) defined in Eq. (13) is the large deviation function associated to the magnetization to. If plotted as 
a function of to, it has a familiar form: at high temperature it is a convex function with a single minimum in to = 0, 
while below a critical temperature (in this case T = 1) there are two minima at m = ±to* and a maximum in to = 
and /(to) is no more convex, see figure 1. In presence of a non-zero external field B, one of the two minima has a 
lower /(to) (a higher probability). 

The function /(to) is related to the probability, at equilibrium, to find the system in a configuration with magnetiza- 
tion to. This means that at low temperature, there is high probability of finding the system with magnetization ±m* 
(one of the two values will be preferred for B ^ 0), while there is a low probability of finding an intermediate value of 
to, in particular m <~ 0. In other words, the system spends a lot of time close to configurations with to = ±to*, and 
much less time close to configurations with m <~ 0. On the other hand, to go from —to* to +to*, the magnetization 
must cross m ~ 0; therefore, the number of such transitions must be very small, otherwise the probability of to ~ 
would be large. The only possibility is that the system stays for a long time close to —to*, then performs a fast jump 
to m*, stays a long time there, then performs a fast jump in the other direction, and so on. 

To be more precise we should introduce a model of the dynamics and analyze it in de- 
tails (=> Ex.II.l); a nice detailed discussion of this, by R.Monasson, can be downloaded from 
http://www.phys.ens.fr/~monasson/Appunti/ising.ps. It turns out that the system stays close to ±m* 
during large intervals of time, whose size scales as r± ~ expNt±; the rare jumps between these two states take a 
time that increases only polynomially with N, Tj ump <~ N a . In other words, if prepared close to one of the two 
minima of the free energy, the system remains there, with high probability, for a time that scales exponentially with 
N; this is true for both minima, in particular for the one with higher /(to) that is less probable. The latter is then 
the classic example of a metastable state. 

We are led to identify metastable states with the minima of a suitable free energy function /(to). Before turning to 
a more general definition, it is convenient to highlight some of its properties that will be very useful in the following. 
First of all, note that /(to) is an analytic function of both m and (3. It does not show any singularity at the critical 
temperature. On the other hand, we know that for B = there is a phase transition at T — 1; for instance the average 
magnetization is zero above T — 1 and non-zero below. Indeed the total free energy of the system is given by 



/ = ~ \ogZ= ~ [ dme^ N ^ = min/(m) 

iV iV J m 



(15) 



The singularity of the thermodynamic observables (energy, magnetization, etc.) at the phase transition comes from 
the bifurcation of the minima of /(to), i.e. by the minimization involved in the computation of /, and not by a 
singularity of /(to) itself. This is a very peculiar property of fully connected models; we will see that in more realistic 
models the situation is completely different. Still, the fact that /(to) is analytic in (3 at all (3 suggests that we can 
compute it in a series expansion for small f3 (actually, in our case j3f is just a linear function of /?!). This is the 
strategy that we will follow in the next sections. 



2. Metastable states in fully connected models 

A general result of statistical mechanics (see e.g. [9, 48]) states that it is always possible to decompose the 
equilibrium probability distribution as a sum over pure states. In finite dimensional systems, pure states are defined 
by taking the thermodynamic limit with a given boundary condition [48]. If there are many pure states, one can 
select one of them by adding to the system a small field: the probability distribution of the pure state can be thought 
as the limit of zero field of the Gibbs measure in presence of the field. 

However, in a fully-connected system, such as the Ising model defined in (12), there is no space notion since all the 
spins interact with all others: thus there is boundary, and no boundary conditions can be applied to the system. The 
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pure states can be selected only by using an external field. In this way we are able to define the Gibbs distribution 
restricted to one pure state, P a (Si, ■ ■ ■ , Sn). We can write the decomposition of the Gibbs measure as 

-I3H[S\ 

p(s u --- ,s N ) = = ^ a P a (Su - ■ ■ > ( 16 ) 

a 

where a is an index labeling the states and w a is the weight of each state, ^2 a w a — 1. In general, a pure state is 
specified by P a , or equivalently by the full set of correlation functions (Si ± ■ ■ ■ Si n ) a . A very important property of 
the probability distributions of pure states is the clustering property, i.e. the fact that connected correlation vanish 
at large distance [5]. Since for a fully connected model there is no space notion, the clustering property reads simply 

N 

P a {S lr --,S N ) = '[[Pf'(S i ). (17) 

i=l 

In other words, spins are completely decorrelated within one state. The single-spin probability distribution is specified 
by the average magnetization of the spin Si, mf = ^2 S SP"(S); in fact for Ising spins we have 

ms) = i±f^ . (is) 

Thus, in a fully connected spin model, a pure state a is completely determined by the set of local magnetizations mf, 
i = 1, • • • ,N. Note that this result is valid only for these very special models. 



3. The general definition of the free energy functional 

In the case of the Ising model the two states are characterized by a uniform magnetization, mf = to". In a disordered 
system, for a given sample (realization of the J's) each state is characterized by an amorphous magnetization or density 
profile. Therefore a good starting point to compute the properties of these states is the free energy as a functional 
of the magnetization/density profile 4 [9, 49, 50]. It is a standard object in statistical mechanics, but it is useful to 
review here its definition and fix some notations. As we said above, in the general case the magnetization profile is 
not enough to determine a state since one should specify all the set of correlation function; however, the knowledge 
of mf is already a good approximation. Therefore the definition of free energy functional that we will give in the 
following is an useful concept also for finite dimensional systems. 

Consider a system of spins 5 in which an external magnetic field 6 bi acts on the spin Sf, the free energy is 

-[3F[b] = log e-' m ^+' 3 h * s * . (19) 
s 

Here and in the following we omit the explicit dependence on (3 of the free energy. Note that F[b] is extensive, i.e. 
proportional to N; we will use capital letters for extensive quantities. The local magnetization in presence of these 
fields is 

mM = (Si) b = ~F[b] ■ (20) 

and the susceptibility is 

dm i d 2 F\b] „,,„ m n 

X ij = ^ = -^=H(S i -m i )(S j -m j )) b . (21) 

Note that \ij is a positive matrix 7 . 



4 In field theory it is the generating function of the irreducible correlation functions. 

5 For a system of particles replace the magnetic field by an external (chemical) potential. 

6 We will use the letter b to denote external magnetic fields and the letter h to denote internal fields due to the other spins of the system. 

7 We can write Xij = {SSiSSj) or in matrix notation x = (<5S'<5S' T ). Then, for any vector v, we have v T \v = ((<5S ■ f) 2 ) > 0. This holds 
in particular for the eigenvectors of x, therefore the eigenvalues are all positive. 
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The free energy functional F[m] is defined as the Legendre transform of F[b]: 



-/3T[r. 



-ft max 

6 



F[b] + bitrii 



mm 

b 



log 



-f)H[S]+f)Y. t h(S l - m% ) 



(22) 



in this way b — b[m] is a solution of (20), i.e. it is the set of local fields bi[m] that are needed to enforce the 
magnetizations mj. The maximum condition comes from the fact that the susceptibility (21) is positive, hence the 
second derivative of F[b] + J2i birrii is negative. 
Define, for fixed bi and rm, the average 



(0) = 



J2 S 0[S]e- fjH W +fj biiSi-rm) 



(23) 



the field b is determined by the condition that the derivative with respect to bi of the last expression in Eq. (22) 
vanishes. This condition can be written as 



(S, - rm) = . 



The solution b[m] is the derivative of T[m]: 

h = /-r[m] , 

ami 



dbj _ d?T[m] _ , ^ Q 
drrij drriidrrij 1 



and the free energy F[b] is the inverse Legendre transform of T[m]: 



-f3F[b] = -/3min 



r[m] - ^ b i m i 



(24) 



(25) 



(26) 



the stationarity implies that m[b] is a solution of (25), and it must be a minimum since the second derivative of F[m] 
is positive. This leads to an important result: if there are no external fields, bi = 0, the free energy of the system is 



-riog^. 



-PH[S\ 



minrfml 



(27) 



On the other hand, we have a problem: T[m] is a convex function, hence it cannot have local minima, which is a 
problem if we want to use T[m] to define metastable states. What goes wrong? The problem is very simple and can 
be easily understood by computing F[b] and T[m] for the fully connected Ising model. We consider a uniform field, 
bi = b. Using the definition of F[b] and the results of section II A 1: 



F(b) = -Tlog J dme~ fiN[f(m) - bm] = Nmm[f(m) - bm] 



(28) 



Hence F(b) is convex even if /(m) is not convex; inverting the Legendre transform, we obtain that T(m) is the convex 
envelope of /(m), see figure 1. 

We are therefore interested in F(m) = N f(m) and not really in T(m). How can we define in general a non-convex 
functional F[m], such that its minima are the metastable states? A way out of this problem is to compute the high 
temperature expansion of T[m] defined above. The reason is that at /? = there is no interaction and F[m] — T[m]. 
When expanding around this limit, the convexity can be lost if metastable state appear. One can check explicitly 
that this gives the correct result for the Ising ferromagnet and we will do it at the end of the computation. Note that 
for finite dimensional systems, it is not possible to give a general definition of F[m], and metastable states are more 
difficult to define. Still, expressions of F[m] based on high temperature or low density expansions are often used to 
define metastable states in an approximate way. In the following, we will denote by F[m] the functional obtained by 
a high temperature expansion of T[m] defined in Eq. (22). 



4- The Georges-Yedidia expansion 



We will now derive a high temperature-small coupling expansion of this functional following the strategy of [51]; we 
will see that in fully connected models, where individual couplings vanish for TV — > oo, the expansion can be truncated 
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FIG. 1: The functions f(m) and T(m)/N for the fully connected Ising ferromagnet at B = and T < 1. 



after a finite number of terms, yielding a non-convex free energy functional F[m] whose minima can be identified with 
the metastable states of the system. 

To simplify the notations it is convenient to define A@ [to] = — j3F[m] and Af = f3bi. Then 

A' 3 [m] = log]T e -™+^ *?&- mi ) ; (29) 
s 

where A^ is determined by (Si — TOj) = (the average is on the measure defining A 13 for fixed A^ and to) and 
\? = —d mi A^[m]. For f3 = we can easily compute A [to] because there is no interaction among the spins; we will 
give explicit examples below. 

We wish now to compute the derivatives of A' 3 [to] in f) = 0. ft is convenient to introduce the "observable" 

U[S] = H[S] -(H)-J2 fy*i(Si - to,) , (30) 

i 

and recalling that mi = (Si) for all /3, we get 

(U) = , 

>-<g)-«*0. (3!) 



Then we obtain 



d A^m] = /-H[S] +Y,dp*i(Si - m0 ) = - (JT) 



d/8 

^A>[m\ = (TO) = <C/ 2 ) , (32) 
|> W — <*) + ~<D«>. 

and so on. To compute these at /? = 0, we need to know c^Af (/3 = 0) that enters in C/. In the higher order derivatives 
also higher derivatives of A' 3 appear. 

The derivatives of \ fi at j3 = can be computed recalling that Af = —d mi A^\m\: then 

g Ai 9 _ g 9"^H 



For instance, 



and so on. 



5. Free energy functional for a generic Ising model 
As an example, we consider a model of Ising spins with Hamiltonian 

H[S\ = -\Y t JiiSiS j -BY i Si . 

i=jLj i 

First we need to compute the zero order term: 

A°[m] =log^e^ A ^- m <) = -^A°m 4 + log2coshA? , 

s i 

dA d ^ = (Si - m 4 ) - tanh(A°) - rm = . 



Expressing A° as a function of m, we get 



1 + m,i 1 + nii 1 — mj 1 — m, 



■ log 1 log ■ 

2 S 2 2 B 2 



Note that at /3 = the spins are uncorrelated, therefore (SiSj} — mirrij, (SiSjSk) Q = mimjirik and so on. 
allows to compute 



13=0 



- ( H )o = J E J v m i m j + #E mi > 



d 

=0 rfmj 



(^)o = - E J y m i - 6 



Plugging the last equation into (30) we obtain 

Uo = - \ J2 J ij( S i - m i)( S 3 - m i) > 



that allows to compute the second and third derivatives of A@ . The result is 
( U o)o = ^E4(!- m i K 1 -" 1 ?) ' 



^ 2 

d/3 



3=0 



d/3 2 4 



= 2m, E 4(!- TO i) 



it 
dp 



3 A p [m] 



3=0 



( U o)o = 2 E4 m 4! - ro?Ml - m, 2 ) + £ Jy J ik J jk (l - m 2 )(l - m 2 )(l - m 2 ) . 
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Collecting these results, and going back to the original notation, we obtain the final result 

ozrr 1 ( l + mi 1 l + rrii l-rrii 1 - mA 1 ^ ok? 

-0F[m] = - 2^ I 2 log — — + lo § ) +l3 2^ J ^ m o + P B 2^ 



Hi, 



2 ^ 4mj(l - ro?)mj(l - m 2 ) 



+ X! J ij J ik J jk(l-rni)(l-mj)(l-rnl) 



+ 0(/3 4 ) , 



/36j [m] = atanh(mj) — /3 



+ /3 2 m l ^ J 2 .(l-m 2 ) + 0(/3 3 ) . 



The condition of absence of external field hi — gives the TAP equations: 
rrii = tanh/3/ij , 



hi = b + ^ Jij^j + m J2 J U l - m ") + °^ 2 ) • 



(41) 



(42) 



6. Back to the fully connected ferromagnet 

Before going to the more complicated SK model, it is instructive to look at the fully connected ferromagnet, where 
Jij = jj for all ij. In this case the spins are all equivalent, therefore we expect that at the minimum of the free energy 
the magnetizations are all equal, m,i = m. Then it is easy to see that the terms of order (3 2 and /3 3 vanish for N — > oo. 

Retaining only the 0(/3) term, the TAP equation becomes the familiar mean-field equation 



and the free energy is 



m = tanh[/?(B + m)\ , 



f(m) = N^Flm] = -TS°(m) - -m 2 - Bra 



(43) 



(44) 



that is the correct result, as we anticipated. At this point it should be clear that f(m) is not convex because we 
obtained it by neglecting higher order terms in /?; the true function T(m) is convex but non-analytic at low temperature 
(see figure 1), so we cannot get it by a high temperature expansion. This example suggests that the truncation of the 
high temperature expansion has the effect of making the existence of metastable states manifest in F[m]. Note that 
obviously only local minima of the TAP free energy, i.e. the solutions of the TAP equations with positive Hessian 
dmidmj ' can ^ e interpreted as metastable states. 



7. TAP equations for the SK model 

In the SK model we set B = and the are Gaussian random variables with zero mean and variance Jfj = jj. 

Thus the are typically of the order of 1/y/N, and the terms 0((3 2 ) in the TAP free energy are now relevant. Still, 
the terms 0(f3 3 ) can be neglected as they vanish for N — > oo. 

Note that the term JijTii'mj is a sum of a large number of terms; the signs of and mj arc random, 

but we expect the sign of mj to be correlated with the sign of hi = X)j(^i) Jijf^j- The latter quantity is the sum 
of N terms, each of order 1/y/N, and is therefore finite for large N. As the sign of m; and hi are correlated, 
E^j Jijmirrij = J2i "iA ~ N. 

Conversely, the term 0(/3 2 ) in the free energy is the sum of a large number of terms, all of positive sign. Therefore 
in this case the fluctuations are less important and we can replace, for large N, the J 2 - with their average. Defining 
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1 = ^Ei™!* we g et 

-pF[m] =Y J S°(m l )+p 1 -Y J J^™ 3 +N^{l-q) 2 , 

*j ^ (45) 

hi = ^] Jij-rrij + Pmi(l - q) . 

that are the TAP equations for the SK model. 

At high temperature these equations have only the paramagnetic solution m, = 0. Now we can study the stability 
of this solution on lowering the temperature. The stability matrix for the paramagnet is obtained from (45): 

d 2 F[m] 



dmidrrij 



= (/3 + p-^Sa - Jij ; (46) 

nij-0 

the stability condition is that all the eigenvalues should be positive. The spectrum of the matrix is known to be 
the Wigner semicircle defined in the interval [—2, 2]. For T > 1, one has j3 + /3 _1 > 2, and the paramagnet is stable. 
However, at T = 1 the spectrum touches zero, hence zero modes appear suggesting that below T = 1 the paramagnet 
becomes unstable. 

However for T < 1, again /? + > 2, so it seems that the paramagnet is stable for all temperatures. This strange 
result is in fact incorrect: the paramagnet becomes indeed unstable at low temperature. A stringent indication of this 
fact is that its total free energy is f par a — F[m ~ Q]/N ^ —(3 /A — Tlog2; and the entropy is S para = —dF para /dT = 
log 2 - /3 2 /4, that becomes negative for T < l/(2v4og2) - 0.911. This is a nonsense since we are dealing with Ising 
spins, the states of the system are discrete and the entropy must be positive. The instability of the paramagnet 
actually happens at T = 1, but it is missed by the TAP equations (45) because of the approximations we made; there 
are different way to understand this. For instance, one can look at the leading terms in the small /3 expansion; the 
resummation of these terms is divergent when 1 — (3 2 J 2 (1 — q) 2 > 0, which shows that the TAP equations (45) do 
not make sense for the paramagnet (q = 0) at f3 > 1 [50]. Alternatively, one can derive the same condition using the 
cavity method that we will discuss in the following [52]. This fact points out that the approximations we made in 
neglecting higher order terms and substituting others with their average are not completely harmless. In fact, while 
they are correct for stable states, they are not for unstable states and doing them blindly might stabilize solutions 
that are otherwise unstable. 

What happens then below T = 1? At low temperature the TAP equations have many solutions with m, ^ that 
we would like to interpret as (stable or metastable) thermodynamic states. However, the situation for the SK model 
is rather complex, and it is better to investigate first a much simpler model, the spherical p-spin model. 



8. Spherical p-spin model 



We will now compute the TAP free energy for the spherical p-spin model that we will study in details in the 
following. 

First of all, we replace Ising spins by real variables <7j with the constraint ^ of = N: the spins live on the 
TV-dimensional sphere of radius y/~N and for this reason this is called spherical model. This simplification is very 
convenient for analytical calculations, but it is not useful for the SK model because the spherical version of the SK 
model is equivalent to a ferromagnet [53]. 

The p-spin model is defined by the Hamiltonian 



H W] - ~-f Jh-ipVh ■■■0-i p = - Jii-i p <7ii---<7i; 

ii-'-ip ii<i2<---<i P 

where the J are again Gaussian random variables with zero mean and average 

! 



pi 



Using the integral representation of the delta function, the zero order term is given by 

e A°[m] = J da5 (y-o- 2 - n\ eE. = ^ | do-e-^^+vN+^x^-mt) 



du 

27 6XP 



(47) 



(48) 



(49) 
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For large N we can evaluate the integral via a saddle point; in addition the stationarity condition for A° gives 
A 4 ° = 2/j, mi . Then 



A°[m] = Nstf, 
the stationarity condition gives \i = 2 (i- q ) ano - fi nau Y 



1 ( -k 
M(l - g) + tt log - 



(50) 



^°[m]=JV-Iog(l-g) 



(51) 



up to an irrelevant constant. By a similar computation one can show that the spins are uncorrelated up to 1/N 
corrections, (cricrj) — mi-raj; note also that (erf) = 1 due to the spherical constraint. The 0((3) term is then simply 

pi Sii-.-ip Jii-'-iplTT-i! ' ' ' m i p - 

The operator Uq is 



Uq = ^2 J h-i P [ a ii ■■■dp ■■■m ip -p(o ll -m il )m l2 ■ ■ ■ m ip ] . 



pi . 
'i ■■■*p 



(52) 



To compute the 0(/3 2 ), we assume as in the SK case that we can replace J 2 by its average. Therefore in computing 
the average of Uq we must keep only the terms with the same coupling, i.e. such that the indices i\ ■ ■ ■ i p are equal 
up to a permutation; otherwise, the two J have random sign and the contribution is sublcading for large N. It is also 
useful to recall the relation (U(ai — rrn)) — 0. Then we get 



it) 



N 



13=0 



= {U^Q^^[l-pq p - 1 +q P (p-l)} , 



and the TAP free energy is [46] 

/N = ^H = -^Mi-?)-^ E 



■mi 



tL[l- pq P-i +q P(p-l) 



(53) 



(54) 



9. Summary and remarks 



In this section we computed the free energy functional for some spin glass models that we will discuss in the next 
sections. It is useful to summarize some important remarks that emerged during the discussion and that will be useful 
in the following: 

1. In performing the high-tcmpcrature expansion, wc did not really define F[m] by taking the maximum as in (22). 
Instead, we continued the /? = solution. This is not completely correct since the convexity of F[m] is lost in 
this way. 

2. The "advantage" is that the local minima of F[m] can be considered as metastable states, as we discussed in 
the case of the ferromagnet in external field. We will go back to this in the next section. 

3. The approximations made in deriving F[m] for disordered models can have important effects on the stability of 
the solutions, in particular they stabilize solutions (e.g. the paramagnet) that are otherwise unstable. 

Given these remarks, we now turn to the analysis of the solution of the TAP equations for the simplest case of the 
spherical p-spin. 



B. Metastable states and complexity 



1. The simplest example: the spherical p- spin glass model 



In the last section we defined the spherical p-spin model. This is the simplest spin glass for reasons that will be 
clear in the following, therefore it is a good starting point to understand the physics of spin glasses [46] . 
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FIG. 2: (Adapted from [12]) Sketch of the evolution in temperature of the TAP states for the spherical p-spin model. Each 
group of TAP states of free energy / can be followed in temperature until it becomes unstable and disappears. The complexity 
vanishes continuously at the ground state / m »n and goes abruptly to above the maximum free energy f max - The bold 
line is the free energy /* of the states that dominate the partition function. The dashed line is the equilibrium free energy 
/* — TE(/*) that takes into account the entropic contribution of the degeneracy of the states and is equal to the free energy 
of the paramagnet. 



As we discussed above, the Gibbs measure can be decomposed in a set of pure states, that in fully connected 
models are completely determined by the set of local magnetizations mf, i = 1, • • • , N. Note that the same holds for 
spherical spins because the distribution of a single spin turns out to be Gaussian with (of) = 1 (due to the spherical 
constraint), therefore the only free parameter is the average, m z a . The local magnetizations of pure states are the 
minima of the TAP free energy functional F[m] [9, 49]. The weight w a of state a is proportional to exp[— f3Nf a ], 
where f a — F[mf]/N. The TAP free energy F[nii] depends, in general, explicitly on the temperature, so the whole 
structure of the states may depend strongly on temperature. 

We derived the expression (54) of the TAP free energy for the fully connected p-spin models, from which the 
distribution of states can be computed. Here we will not perform the full computation, that can be found in [46] . We 
will explain the result, and then in the next section present a different and simpler method to obtain the same result. 

A peculiar property of the spherical p-spin model, which simplifies a lot the description of the results of the TAP 
computation, is that the dependence of the free energy functional on T is very simple. Indeed, the states are labeled 
by their intensive energy e at T = 0. The number of states of energy e is 17(e) = cxpA^E(e); the function £(e) 
is called complexity: it is a concave function that vanishes continuously at the ground state energy e m ,„ and goes 
discontinuously to above some value e max . At finite temperature, the minima are "dressed" by thermal fluctuations 
but they maintain their identity and one can follow their evolution at T > 0. At some temperature T max (e), 
thermal fluctuations are so large that the states with energy e become unstable and disappear, until, at high enough 
temperature T > Ttap, only the paramagnetic minimum, m, = 0, survives. The temperature evolution of the states 
is sketched in figure 2. 

At finite temperature, the number of states of given free energy density / is f2(/) = ^2 a o~(f — fa) = CX P^VS(/), 
where £(/) = S(e(/)) and e(f) is the T = energy of the states of free energy /. The function £(/) vanishes 
continuously at / = f m in an d drops to zero above / = / mox ; a qualitative plot of £(/) is reported in Fig. 3. The main 
peculiarity of p-spin models is that an exponential number of metastable states is present at low enough temperature. 
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2. The partition function 

One can write the partition function Z, at low enough temperature and for N — > oo, in the following way: 

Z = e -P»h«<X) „<£ e -PN fa = r df J2S(f-f a )e-^ 

a f a (55) 

= J djn{f)e-' m f - J ma * dfe N wn-w ~ e w)-wi ; 

where /* G [f m in, fmax] is sucn that $(/) = / — TS(/) is minimum, i.e. it is the solution of 

rff = T ' (56) 

provided that it belongs to the interval [/ m j n , / ma x]- Starting from high temperature, one encounters three temperature 
regions: 

• For T > Td, the free energy density of the paramagnetic state is smaller than / — TS(/) for any / € [fmin: fmax], 
so the paramagnetic state dominates and coincides with the Gibbs state (in this region the decomposition (55) 
is meaningless). 

• For Td > T > Tk, a value /* £ [fmin, fmax] is found, such that /* — T£(/*) is equal to f par a- This means 
that the paramagnetic state is obtained from the superposition of an exponential number of pure states of higher 
individual free energy density /*. The Gibbs measure is splittcd on this exponential number of contributions: 
however, no phase transition happens at Td because of the equality /* — TE(/*) = f para which guarantees that 
the free energy is analytic on crossing Td- 

• For T < Tk, the partition function is dominated by the lowest free energy states, /* = f m in, with £(/ TO j„) = 
and ftot(T) = f m in — TY,( f m in) — fmin- At Tk a phase transition occurs; the free energy and its first derivatives 
are continuous but the second derivative of ftot with respect to T (the specific heat) has a jump. 

Actually, the paramagnetic solution m, = disappears for T < Td] like in the SK model, this cannot be obtained 
directly from the TAP free energy, but is signaled by other inconsistencies. For instance, if one performs a dynamic 
computation below Td, one can show that starting from a random configuration the system remains confined in a 
small region of phase space and the correlation function of the local magnetization does not decay to zero, as it should 
in a true paramagnetic phase. In the range of temperatures Td > T > Tk, the paramagnetic state is replaced by 
a strange state, in which the phase space of the model is disconnected in an exponentially large number of states, 
giving a contribution £(T) = E(/*(T)) to the total entropy of the system. This means that the entropy s(T) for 
Td> T > Tk can be written as 

s(T) = E(T) + s mb (T) , (57) 

s V ib(T) being the individual entropy of a state of free energy /*. This scenario is realized in a very simple completely 
solvable model, the so-called Random Energy Model (Ex. II. 2). 



3. A method to compute the complexity 

How can we compute the properties of the metastable states, for example the density of states 0(/)? For systems 
that present a structure of the free energy landscape similar to p-spin glasses, a general method to compute the 
complexity as a function of the free energy of the states without directly solving the TAP equations has been proposed 
in [54]. 

The main problem we have to face is that, unlike in a ferromagnet, the states cannot be classified according 
to a symmetry. In fact, the local magnetizations mf are different and amorphous in each state. In principle, we 
should put an infinitesimal local magnetic field, different for each spin, in order to select a state. Moreover, the 
local magnetizations of the states depend on the couplings J, therefore we must put the small field before taking the 
average over the disorder, and the field should be correlated with the disorder. Therefore we cannot study the states 
by selecting them according to an external field, as we usually do in standard phase transitions. 

The idea of [54] is to consider m copies of the original system, coupled by a small attractive term added to the 
Hamiltonian, that favors configurations in which the m copies have similar local magnetizations. The coupling is then 
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FIG. 3: (From [55]) A sketch of the complexity as a function of the free energy density for system belonging to the p-spin class. 
The value /*(m,T), solution of ^ = ^, is also reported. 



switched off after the thermodynamic limit has been taken. For T < T dl the small attractive coupling is enough to 
constrain the m copies to be in the same TAP state. At low temperatures, the partition function of the replicated 
system is then 

rfmax 

Zm = e -f3N*(m,T) ^ ^ e -/3JVm/ a = / ^ g iV[E(/)-/3m/] _ gJV[S(/*)-/3m/*] ^ (gg) 

a J fmin 

where now f*(m, T) is such that $(m, /) = mf — TS(/) is minimum and satisfies the equation 

~df = ~T (59) 

As a result, we see that an additional weight to has been given to the term —/3f in (58). 

If to is allowed to assume real values by an analytical continuation, the complexity can be computed from the 
knowledge of the function $(m, T) = mf*(m,T) — TS(/*(m, T)). Indeed, it is easy to show that 

d*(m,T) 

dm ( 6Q j 
S(m,T) =S(/*(m,T)) =m l d ^ m = m /3f*(m,T) - /3$(m,T) . 



The function can be reconstructed from the parametric plot of f*(m, T) and S(m, T) by varying m at fixed 

temperature. 

The glass transition happens when (3 equals the slope s (T) of £(/) in / = f m in, so Tr- is denned by TkSo(Tk) = 1. 
If to < 1, the value of f*(m, T) corresponds to a smaller slope with respect to m — 1, so the glass transition is shifted 
towards lower values of the temperature, see Fig. 3. For any value of the temperature T below Tk it exists a value 
m s (T) < 1 such that for to < m s the system is in the liquid phase. The value of m s (T) is defined by the condition 
f*(m,T) = J m in{T) or equivalently S(/*(m,T)) = 0. The free energy for T <T K and to < m s {T) can be computed 
by analytic continuation of the free energy of the high temperature liquid. Since the free energy is always continuous 
and it is independent of to in the glass phase (being simply the value f m in(T) such that S(/ m m) = 0), one can 
compute the free energy of the glass below T K simply as fgi ass (T) = f m in{T) = ®(m s (T),T)/m s (T). 

This method allows to compute the complexity as well as the free energy of the glass, i.e. of the lowest free energy 
states, at any temperature, if one is able to compute the free energy of m copies of the original system constrained 
to be in the same free energy state and to perform the analytical continuation to real to. In [56] it was applied to 
the spherical p-spin system and it was shown that the method reproduces the results obtained from the explicit TAP 
computation (see also [46] for a detailed discussion). In the next section we will discuss this computation in detail. 



20 



4- Replicated free energy of the spherical p-spin model 



We wish to compute the free energy of m copies of the original system, coupled by an attractive term. Note that 
the to copies all have the same couplings J, therefore we have to take the average over the J of the free energy of the 
total system: 



$(ro,T) 



T T f 

■— \ogZ m = -— log / Dai ■ ■ ■ £>a m e-/ 3 ( ff -'[ CT il+-+^km])+coupimg _ 



(61) 



To lighten the notation we included the spherical constraint in the integration measure that we now denote by Da, 
therefore Da = (ft. da l ) of = N). 

Let's forget for the moment about the coupling, whose role will be clear later. The main problem now is that we 
have to perform the average of the logarithm of the partition function, which is not an easy task. For this we make 
use of the following formula for the logarithm: 



logx = lim d n x n , 



T 



n-s-0 



$(m,T) = — - lim S„(Z ro )" . 

iV n— >0 



(62) 



This means that we have to compute the average over the disorder of (Z m ) n . This is difficult for real n, but can be 
done for integer n. In fact, for integer n (and to) we get 



{Z m y = J Da,- ■ ■ Da nm e-0( H A*i]+-+HA*nm]) , 



(63) 



where we now have a system of to x n copies or replicas; by convention we assume that replicas {1, • • • , m} are the 
original ones (coupled), and {to + 1, ■ ■ ■ , 2to} are a copy of them (coupled), and so on. Note that there is no coupling 
between replicas belonging to different blocks {1 + Im, ■ ■ ■ ,(£ + 1)to}. 

We now follow closely the presentation of [46]. Labeling replicas by an index a = 1, • • • , nm, and dropping irrelevant 
normalization constants, we get: 



{ZmY = J Dot J] J dJ tl ... lp exp 



h < - <i P ' 

= / Da'! II exp 

il<---<i p 



a=l 



I 



ANP- 



/ 



= / Daf exp 



/S 2 



a,b 

N 



(64) 



ANP' 



l,mn 
a,b \ i 



Dal exp 



o2 n 
ab 



Note that when taking the average over the J's we had to take into account that they are symmetric under permu- 
tations, therefore the average must be taken only over i\ < ■ ■ ■ < i p . 

We can see here the replica trick at work: we started from a set of interacting spins and uncoupled blocks of replicas, 
and averaging over the disorder we decoupled the spins, but coupled the replicas; note however that the decoupling 
of the spins holds only for a fully-connected model. In particular, the overlap between two different replicas of the 
system appeared in the calculation: 



(65) 



Note that Q(a a 7 a a ) — 1 due to the spherical constraint. The overlap is a very important concept in spin glass physics: 
it measures how much the configurations of replica a and replica b are different. 

We introduce now a delta function that fixes the overlap and use the integral representation of the delta function: 



1= / dQ ab S[NQ ab -J2<c 



dQ ab d\ ab e NX ^° 



(66) 
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note that the integral over X a b is done on the imaginary axis in the complex plane. For a ^ b, we integrate over Q a b, 
so the above line is equal to 1, and we can insert it in the partition function 8 , while for a = b, we must fix Q aa = 1 
and the delta function just imposes the spherical constraint (hence we do not integrate over Q aa ). We obtain 



(Z m ) n = J dQ ab d\ ab da? exp 



/3 2 N 
4 

ab ab i ab 



(67) 



dQab d\ ab exp [N S{Q, A)] 



with, 



S(Q, A) = !L Q p ab + E KbQab - \ log dct(2A ab ) (68) 

ab ab 

In (67) the integration over Q a b is performed for a < b and it is assumed that Qfca = Qab and Q aa = 1, while the 
integration over X a b includes also a = b to enforce the spherical constraint. The sums in the exponentials are over all 
the indices, including a = b. 

The advantage of this form of the integral is that we can use the saddle point (or Laplace, or steepest-descent) 
method, to solve it in the limit N — > oo. This simplification is a consequence of the mean- field structure of the model, 
and it is the result of the decoupling of the sites operated by the use of the replica trick. The saddle-point method 
states that in the limit N — > oo the integral (67) is concentrated in the maximum of the integrand. However, we have 
to be careful here, for a twofold reason. First, the free energy is given by 

$(m, T) = -T lim -J- lim d n [ DQ ah DX ab exp [N S(Q, A)] (69) 

TV— >oo 1\ n— >0 J 

and thus we should first take the limit n — > 0, and then N — > oo. Unfortunately, we are unable to do this: S is not 
an explicit function of n, and moreover we need to send N — > oo first to solve the integral. As a conclusion, we need 
to exchange the order of the two limits, solve the integral, find a parametrization of the matrix Q a b 7 and finally take 
the n — > limit at the end. This is clearly quite dangerous from a strict mathematical point of view. 

The second point we have to pay attention to, is what do we actually mean by "maximum" of S. The problem here 
is that the number of independent elements of Q a b is n(n — l)/2, which becomes negative is the limit n — > 0. It is 
hard to say what is a maximum of a function with a negative number of variables. There is however a criterion we can 
use to select the correct saddle point: the corrections to the saddle point result are given by the Gaussian integration 
around the saddle point itself. This integration gives as a result the square root of the determinant of the second 
derivative matrix of S, and thus, in order to have a sensible result, we must have that the analytic continuation of 
all the eigenvalues of this matrix are negative. Summarizing, we have to select saddle points with a negative-defined 
second derivative of S [57]. 

At this point we can proceed with the saddle point calculation. We first find the stationary point of S with respect 
to X a b- By using the general formula, 



i) 



dM ab 



we get. 

and thus we get the final result 



logdetM^^M- 1 )^ (70) 
1X ab = (Q-^ab (71) 



J dQab e NS ^ 



ff2 i (72) 

= T E Qa» + o lo g dct Q + C'nm, 



ab 



where C is a numerical constant and the saddle point on the Q still has to be found. Note that the last term can be 
omitted, since it gives a shift proportional to mT of the free energy, which is just a constant shift of the entropy. 



Note that since Q a \, is symmetric we do not need to fix Q^ a independently. It is convenient to do so in order to have the sums running 
over all values of a, b. It is easy to check that an equivalent choice is to insert the delta function only for a < b, but change A a (, in 2A a i,; 
this change of variable amounts to a constant in front of the partition function. 
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5. 1-step replica symmetry breaking 



The saddle point equation for Q is complicated and cannot be solved in general. We need a simple ansatz on the 
form of the matrix Q in order to find the solution. How can we guess the form of Q? At this point remember that 
the nm replicas are divided in blocks of to replicas, such that each block is coupled while replicas in different blocks 
are not coupled. The matrix element Q a b, according to (65), measures the overlap of replicas a and b, i.e. how much 
the two replicas are close to each other. Therefore, it is very natural to assume that replicas that are in the same 
block have high overlap due to the external coupling, while replicas in different blocks have much smaller overlap: in 
practice we can assume that their overlap is zero, as it would be for completely uncorrelated replicas. Moreover, we 
assume that the overlap between coupled replicas is the same for any pair of replicas a ^ b belonging to the same 
block. This ansatz, that is the simplest possible, is called 1-step replica symmetry breaking (Irsb) for reasons that 
will be clear in the following. 

The IRSB ansatz for Q reads, for n = 2 and m = 3: 



Q 




(73) 



and we can use the relation 




(l-q)™-i[l + ( m -l)q] 



to get 



detQ = {(l-q) m - 1 [l + (m-l)q]} n . 

Substituting this ansatz in (72) we get S(Q) = —3nm4>iRSB where the Irsb free energy is 

1 ( B 2 to— 1 1 1 

<t> 1R3B (m, q, T) = -— | P - [1 + (to - l)<f ] + log(l - q) + - log [l + (to - l)q] j . 



(74) 



(75) 



(76) 



Note that this expression, that has been derived for integer to and n, is perfectly well defined also for real to and n, 
so that now we can perform the continuation ton->0 and real to. Finally 



T 

$(m, T) = —— lim d n exp [ - 8nmN(j> 1R sB(m, q* 

iV n— >0 



m<p 1RS B(m,q,T) . 



(77) 



We must find the stationary point of this expression with respect to q. An important remark is that, due to the simple 
structure of the saddle-point matrix Q, the result for $(to,T) is equivalent to the one that would be obtained by a 
direct computation of — jj-\ogZ m without introducing the n additional replicas. This is due to the fact that replicas 
in different blocks are completely uncorrelated, and therefore S(Q) is simply proportional to n. This fact is very 
specific to the p-spin model and might be false in more general cases; in these cases the calculation might be more 
complicated, but the general strategy outlined above remains valid. We will see an example later on, in =>• Ex. III. 1 
(but it's too early to do the exercise at this point). 



6. Dtscussion 



After having written all these formulae, it is worth to take a break and think a little bit on what we have done. We 
assumed that replicas in different blocks were uncorrelated, while replicas in the same block were correlated and we 
assigned them an overlap q. We said that this is due to the external coupling that was present from the beginning 
but never written explicitly in the free energy. The reason for not writing the external coupling is that we do not 
really need it; indeed, the free energy (77) as a function of q always has a stationary point at q = 0, but for low 
enough temperature, a second stationary point at q* =/= appears. The external field is needed only to make this 
second stationary point stable, and has to be switched off at the end of the computation. In other word, we used the 
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external field only to select the g ^ stationary point of (77). This is very similar to what is usually done in first 
order phase transitions by selecting one state or the other using an infinitesimal external field (e.g. in a ferromagnet 
the states ± can be selected by an infinitesimal positive or negative magnetic field). Therefore, if we are interested 
in the partition function of to replicas in the same state, as we did in the previous section in order to compute the 
complexity, we should always take the solution with q ^ 0, if it exists. 

In figure 4 we draw, in the (to, T) plane, the region where a solution with q ^ is found (=> Ex. II. 3 and II. 4). 
This region is delimited by the blue line to*(T), and a solution q*(m,T) is found for to > m*(T). In this region, we 
can compute, using Eq. (60), the complexity and free energy as a function of to. 

£(to,T) = m 2 a m [TO- 1 i 8$(TO,T)] = m 2 d m [^ 1RSB {m,q*,T)} , 
/(to,T) =<9 m $(m,T) = d m m<p 1R s B {m,q* ,T) . 

It turns out that starting from m = max{0,TO*(T)} and increasing to, both f(m,T) and £(m,T) increase up to a 
maximum and then decrease on increasing to. This leads to two branches of the parametric curve £(/), of which only 
one is physical: the one corresponding to decreasing f and E, as from Fig. 3 it should be clear that both / and £ 
must decrease on increasing to. 

We then define a line m = m<i{T) (green line) where /(to, T) is maximum, that corresponds to the threshold values 
fmax(T). The complexity is finite for to = md{T), and decreases on increasing to, until it vanishes on a second line 
m 8 (T) , that is also reported in figure 4 (red line) . Above this line, states do not exist anymore because the complexity 
becomes negative. Therefore, states are found for m s {T) > m > TOd(T), and TOrf(T) corresponds to f max (T) while 
m s (T) corresponds to f m i n {T). Note that at high temperature the two lines m<i(T) and to*(T) become practically 
indistinguishable, and at higher temperature the three lines touch; this is the temperature Ttap above which only 
the paramagnetic state survives, see figure 2. 

From the previous discussion, in particular equations (55), (56) and (58), (59), we see that to = 1 corresponds, 
obviously, to the equilibrium partition function of a single copy of the system. When the line TOrf(T) crosses to = 1, 
the saddle point in (55) is exactly equal to f max '- this is the point where the paramagnet breaks in many states and 
defines the temperature T^. Similarly, the point where the line m s (T) crosses m = 1 corresponds to the point where 
the saddle point is equal to f m im i-e- the phase transition Tr. 

For T < Tk, the saddle point of (55) is always f m i n , that can be computed by following the line m s (T). Note that 
m < 1 and, recalling that £ = along m s , from Eq. (60) one obtains: 

= 4>iRSB(m s (T),q*(T),T) . (79) 

m=m 3 (T) 

This last result is very important. Indeed, for T > Tk the free energy of the system is equal to the free energy of the 
paramagnet, corresponding simply to 4>irsb(,™ = 1) = — /3/4. Below Tk, instead, the free energy of the system is 
given, comparing (79) and (78), by the extremization of 4>irsb with respect to both m and q. Actually, the extremum 
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is a maximum, as can be seen by an explicit computation. Remarkably, this last statement (the free energy is the 
maximum of 4>\rsb with respect to m and q) can be rigorously proven [16]. 

To summarize, the replica method allowed us to fully characterize the thermodynamics of the spherical p-spin 
model, in particular to compute 

• the free energy of the paramagnetic phase 

• the free energy of the glass phase 

• the distribution £(/) of all the metastable states as a function of / and T 

The main assumption we made in the derivation is that the states are all equivalent: they have self-overlap q and zero 
mutual overlap, i.e. they are randomly distributed in phase space. This is expressed by the lRSB structure of the 
overlap matrix Q a t where all nonzero entries, corresponding to the replicas in the same state, have the same value q. 
This property is exact for the spherical p-spin model, but this is somehow exceptional: in particular this is not true 
for the SK model, where the states are organized in a very complicated structure. 

7. Spontaneous replica symmetry breaking: the order parameter 

To conclude this section, we will discuss now the concept of spontaneous replica symmetry breaking. Suppose that, 
ignoring completely the above discussion, we try to compute directly the free energy of one single copy of the system. 
Then, using the replica method, we would write 

/ = -^logZ = -^lim^. (80) 

Formally, this computation is completely identical to the one of the previous section, Eq. (63), with v — mn. However, 
now there is no external coupling that suggests us the structure (73) of the overlap matrix. Therefore the simplest 
guess would be simply to set q = 0, i.e. to assume that Q a b is a diagonal matrix and all the replicas are uncorrelated. 
Doing this we would get the paramagnetic solution, that is wrong below Tk- As guessed by Parisi [9], the correct 
solution is to assume the structure (73) also in the computation of (80), and consider m and q as variational parameters 
to be optimized. This is exactly what we obtained previously in Eq. (79), which shows that these two conceptually 
different strategies lead to the same result for the free energy of the glass (which is the correct one [16]). In this case, 
replica symmetry breaking is not imposed by an external field but appears as a spontaneous symmetry breaking, and 
this was historically how it was introduced in order to find the correct solution. 

The above discussion shows that q is the order parameter of the transition: it is zero in the paramagnetic phase, while 
it jumps to a nonzero value in the glass phase signaling the spontaneous breaking of replica symmetry; alternatively, 
we can introduce an external field coupled to this order parameter in order to compute the properties of the low- 
temperature phase, analogously to a magnetic field for a ferromagnet. Therefore the transition is of first order from 
the point of view of the order parameter, but recall that it is of second order from the thermodynamical point of view. 

C. The SK model: full replica symmetry breaking 

We can now go back to the SK model. In this case, the situation is much more complicated. The study of the TAP 
equations for the SK model was initiated already in the original paper [49] (the reprints of this and other relevant 
papers can be found in [9]). The TAP equations have many solutions as in the case of the spherical p-spin model, 
however in this case the computation of their complexity is much more complicated. It was started in [58], but a 
recent revival of interest led to a more complete understanding of the problem, that is reviewed in [59] and references 
therein. 

There are two main reasons for this difficulty: first, the equilibrium states in the spin glass phase (i.e. the lowest 
energy states) are not independently distributed as in the spherical p-spin, where different states have zero overlap. 
The mutual overlaps between equilibrium states are organized in a complicated pattern [9]. Second, the metastable 
states (with free energy bigger then the equilibrium states) are not well-defined minima as in the spherical p-spin: 
indeed, for finite N they come in pairs, one being a minimum and the other a maximum. The two coalesce in the 
N — > oo limit forming a saddle, i.e. a state with one zero mode. 

In the SK model the existence of these marginally stable states is not important as far as the equilibrium properties 
are concerned, since they do not appear in the computation of the partition function: the transition is from the 
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high-temperature paramagnetic state to the low-temperature equilibrium states, and is second-order also from the 
point of view of the order parameter: the local magnetizations are small close to T c [49] and so are the overlaps [9]. 

Here we will only discuss the definition of the order parameter for the SK model and its solution using the replica 
method. For more details on the TAP approach see [9, 49, 58, 59]. 



1. The overlap distribution 

As we already discussed, in presence of many pure states the Gibbs measure decomposes as P[S] — ^ Q w a P a [S], 
with ~^2 a w a = 1. We also said that each state is specified by its local magnetizations mf = (Si) a . We can define the 
overlap of two states 

*# = jf E << = jf E <*>« • ( 81 ) 

i i 

In the case of the spherical p-spin, we found that the natural order parameter was the overlap of two replicas defined 
in (65); the infinitesimal coupling forces the two replicas in the same state, so we have 

q= (Qab) =E W «^ : E^)a(^)a =H W *JjH« = ^W a q aa , (82) 

a i a i a 

and the (thermal average of the) overlap is just the average self-overlap of a state a. Recall that for the spherical 
p-spin we assumed that q aa = q, and that different states were uncorrelated, i.e. that q a p = for a ^ j3. 
In a more complicated situation we might be interested in the probability distribution of the overlap: 

P(q) = ^2 w a Wf}5{q - q a p) ; (83) 

a/3 



in general P(q) will depend on the couplings J and we will then consider its average over the disorder, P(q). We wish 
now to work out a connection between this quantity and the matrix Q a t, that appears in the replicated free energy 
(72). Let us then consider the following quantity, 

9 (1) = ^E^' (84) 

i 

where the average is over the Gibbs measure. By using the decomposition in pure states, we can rewrite q^ in the 
following way, 

<7 (1) = ^EE WqU; ' 9 ( S i)a( S *)p = E WaW f i q °-P = f dq E WaW f i <7 = / dq P(q) q (85) 

i a/3 a/3 J a/3 J 

Therefore q^ is the first moment of the overlap distribution, averaged over the disorder. By using the clustering 
property, we can easily find a generalization of this formula [9] , 
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(86) 



= J[k E ^2 w aWfi{Si 1 ) a ...(S ih ) a {S il ) fi ...(Si h )p= I dqP(q)q k 
The important fact is that we can compute these quantities also using the replica trick. In particular, 



9 (1) = ^E^) 2 = i E e -^ [sll+ff[s21) ^E^ 2 = iT E ^E^ e -^" (so) < (87) 

S^S 2 i S a 



where the last equation is obtained writing Z 2 = lim„^ Z n 2 . If we now go on with the calculation along the lines 
of the previous paragraphs, introducing the overlap matrix Q ab , we get, 

q {1) = J DQ ab e~ NS ^ Q a = Q^e~ NS ^ = (88) 
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where is the saddle point value of the overlap matrix (since now on we will drop the suffix SP), and where we 
have exploited the fact that S(Q a b) is of order n, and therefore does not contribute when n — > 0. Of course, there is 
something wrong about this formula, since replicas 1 and 2 cannot be different from the others, while if we decided 
to call them 4 and 7, we would get a different result when Q a t is not replica symmetric. To understand this point 
we note that if the saddle point overlap matrix is not symmetric, then there must be other saddle point solutions 
with the same free energy, but corresponding to matrices obtained from Q a b by a permutation of lines and columns 
[9]. This is a general result: when a saddle point breaks a symmetry corresponding to a given transformation, all the 
points obtained by applying the transformation to that particular saddle point, are equally valid. This means that 
we must average over all these saddle points, and this is equivalent to symmetrize equation (88), obtaining, 

gW= lim 2 T Q at (89) 
n^o n(n - 1 z — ' 

y ' a>b 

This result is already telling us that there is a connection between q^> and the matrix of the overlap among replicas 
Qab- To go further, we can generalize (89), to get, 

v ' a>b 



A comparison with equation (86), gives for a generic function f(q) the relation 

lim 2 — - 

n->o n(n — 1) 



/ 



a>b 



and in particular choosing f(q) = 5(q — q'), we finally find the crucial equation connecting physics to replicas, 

lim — - 

n->0 n(n — 1) 



^)=lim^ 2 TTTE%-^)- ( 92 ) 

a>b 



This equation shows that the average probability that two pure states of the system have overlap q is equal to the 
fraction of elements of the overlap matrix Q a b equal to q. In other words, the elements of the overlap matrix (in the 
saddle point) are the physical values of the overlap among pure states, and the number of elements of Q a b equal to q 
is related to the probability of q. 

Before turning to a more precise computation for the SK model, it is useful to discuss some general properties of 
the overlaps. First of all, it is reasonable (and correct [9]) to assume that all the states have the same self overlap, 
q aa = qEA as in the spherical p-spin. Then, for any two states a and (3: 

< ^2{mf - mf) 2 = q aa + qpp - 2q a/3 = 2(q EA - q a[3 ) => q EA = ma,x{q a/3 } . (93) 

i 

Moreover, it is convenient to remove the trivial symmetry S —S that is present in the SK model and is reflected 
in P(q) = P(—q). This can be done by adding an infinitesimal magnetic field that will favor one of the two states a 
and —a related by the symmetry. Once this is done, the overlaps are all positive and one obtains a distribution P+(q) 
such that P(q) = (P+{q) + P+(—q))/2. In the following we will drop the suffix + and consider that the overlaps are 
all positive. 

Finally, it is useful to define the function 

I" 1 dx 
x{q) = / P(q')dq' G [0, 1] , ^ = P(q) . (94) 

As P(q) is positive, x(q) is an incresing function and we can define its inverse q(x). In particular we can write 

q {1) = f qP(q)dq = £ q^dq - J q(x)dx . (95) 



2. The Parisi solution of the SK model 



In the case of the SK model, one can again introduce replicas to average over the disorder, with again the result 

J dQ ab e NS ^ ; (96) 
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the explicit form of S(Q) can be found for example in [9]. The replica symmetric solution corresponds to Q a b = 
S a b + 5(1 — 5ab) ( m this case we allow also a finite overlap between replicas in different states). Remarkably, this 
solution predicts that q — for T > 1, while q ^ for T < 1, i.e. one finds a phase transition at T = T c = 1. 
However, the entropy becomes negative at low enough temperature, and as the SK model is formulated for discrete 
spins, this solution is uncorrect. 

If one uses a lRSB ansatz, as in (73), the situation is improved, in the sense that the entropy becomes negative at 
a much lower temperature and it is negative but small at T — 0. It is better to change notation, q — > q\ in (73) and 
to replace the zeros by qo to be more general. Then, within the lRSB ansatz, from (92) one has 
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P(q) = lira - - ^ [(n - m)S(q - q ) + (m - 1)6 (q ■ 



ra->0 n 



qi)] = m6(q - q ) + (1 - m)6(q - gi) , 



(97) 



meaning that two replicas have a probability m of being in the different states and a probability 1 — m of being in 
the same state. Note that this implies that m < 1 as we found in the previous computation. 

Parisi tried then to introduce another level of replica symmetry breaking, by assuming that replicas are split into 
mi blocks, and that inside each block they are further split into m 2 < m\ blocks. For n = 8, mi = 4 and m 2 = 2 the 
matrix Q reads: 
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The 2RSB solution has a better entropy, that becomes negative at lower temperature and is very small at T = 0. The 
P(q) reads 



f 



P(q) = lim [(n - m 1 )6(q - q ) + (mi - m 2 )5(q - q{) + (m 2 - 1)6 (q - q 2 )} 

rwO n — 1 

= m 1 6(q - q a ) + (m 2 - m{)6(q - qi) + (1 - m 2 )6(q - q 2 ) ■ 



(99) 



The positivity of P(q) requires < mi < m 2 < 1. 

Iterating the procedure produces the correct solution. One ends up with a sequence of numbers 1 < ttik < rriK-i < 
■ ■ ■ < m 2 < mi < n, each one corresponding to blocks with overlap qo < ■ ■ ■ < qx, and K — > 00. The equality is 
reversed in the limit n — > 0, m — < mi < m 2 < • • • < ttlk < 1 = m^ + i, and 



K 



(100) 



i=0 



becomes a continuous functions with support in [0, maxj gj]. The function x{q) is a piecewise constant function; in 
the limit K — > 00 it becomes an arbitrary function, that must satisfy only the constraint that x € [0,1], q € [0,1] 
and ^ > 0. The free energy becomes a functional of x(q). Therefore x(q) (or P(q)) is the order parameter of the 
transition for the SK model. 

The Parisi solution has the following physical interpretation: equilibrium states have self-overlap qx = qEA] they 
are arranged in clusters such that states inside a cluster have mutual overlap qx-i- But such clusters are arranged in 
other (super)clusters, and states belonging to the same (super)cluster have mutual overlap qx-2- Superclusters are 
arranged in supersuperclusters, etc. 

The complicated mathematical structure of the Parisi solution, once correctly interpreted, led to a number of non- 
trivial predictions, that we cannot review here; they are discussed e.g. in [8, 9, 50]. At present, it has been proven 
that the free energy of the Parisi solution is correct, i.e. it is equal to the true free energy of the SK model below T c 
[15]. This proof already took twenty years of efforts, and most of the interesting properties of the solution, even if 
confirmed by numerical simulations, have not been rigorously proven yet. 



D. Susceptibilities 



It is interesting to discuss at this point the behavior of the magnetic susceptibilities, that characterize the phase 
transitions we investigated up to now. We will first discuss the case of the ferromagnet, and then turn to the spin 
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FIG. 5: Magnetization m as a function of magnetic field B for a ferromagnetic system at T < T c , for finite TV and for TV — ¥ oo. 



glass. We will discuss explicitly the case of mean field models, but the discussion applies also to finite dimensional 
models, with minor modifications. 



1. The ferromagnet 

In a ferromagnet, the order parameter is the magnetization m(T, B; TV) = ^\ (Si) , or to be more precise 

m*(T) = lim lim m(T,B;N) . (101) 

The magnetization below T c is given by ±m* if B — > ± after TV — > oo. The behavior of m(T,B; TV) is sketched in 
Fig. 5. At finite TV, the magnetization is an analytic function of B; but in the limit TV — > oo, it becomes steep around 
B = where the singularity develops. The susceptibility 

x dm(T,B; TV) , x 

X (r,ff;TV)= ; (102) 

calculated at $ = is the slope of the curve at B — and diverges when TV — > oo at all T < T c . This is not the 
thermodynamic magnetic susceptibility, which is obtained by performing first the limit TV — > oo, and then B — > 0, and 
which is finite as one can check from Fig. 5. The quantity x(T, B=0; TV— >oo) is the susceptibility in the full Gibbs 
measure, which is infinite below T c because the Gibbs measure is unstable towards the decomposition in the two pure 
states with positive and negative magnetization. Note that since x(T, B=0; TV) is analytic at finite TV as a function of 
T, and it diverges for all T < T c , it follows that x(T, B=0; TV^oo) must diverge for T — > T + . 

We can compute the susceptibility at finite TV in the Gibbs measure (i.e. in absence of external field) using the 
fluctuation dissipation relation: 

x (T,B = 0;TV) = g = |^<5 i 5 J ) c (103) 

where (SiSj) c — (SiSj) — (Si) (Sj) is the connected correlation function. For large enough TV, we may think that 
the Gibbs measure is split in the two states a = ±, each with weight w a = 1/2. We have (Si) = 0. For the fully 
connected model, using the clustering property inside each state we get for i =^ j: 

(S i S j ) c = (SiSj) = ±[{Si) + (Sj) + + (Si)_ (Sj)_] = l -[(m*) 2 + (-m*) 2 ] - (m*) 2 , (104) 



and therefore 



X (T, B = 0; TV) = f3(N - l)[m*(T)] 2 + (3 , (105) 



which is divergent for TV —¥ oo in the low temperature phase. Note that this result is wrong, since for T > T c we 
obtain x(T, B=0; TV— ^oo) = /3, which does not diverge for T T+. 
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Indeed, from the mean field equation m — tanh[/3(23 + m)] (which holds only for N — > oo) we get, taking the 
derivative of the equation with respect to B: 

X =(l-m 2 )/3(l + X ) => x (r, g ;AT^oo) = r g^jg^L , (106) 

therefore for T > T c = 1, we get x(T, B;N ^ oo) = /3/(l — /3) which diverges for T — >• 1 + . In order to get this result 
from the fluctuation-dissipation relation we need to take into account that the clustering properties holds only for 
N — > oo; at finite JV there are corrections of order 1/N to (SiSj) c ; since there are A^ 2 terms and a factor of 1/N in 
Eq. (103), these 1/N corrections affect the finite part of x- Still, the calculation leading to (105) is correct for the 
divergent term 9 . 

In summary, the calculation of the leading term of \ m the Gibbs measure and at finite (but large) N can be done, 
using only the decomposition in pure states and the clustering property. This calculation shows that x(T',B=0;N) 
is divergent for N — > oo in the whole low temperature phase, which is to be expected from Fig. 5. But since \ is 
divergent for T < T c , it must diverge for T — > T+. In this way we identify the susceptibility that diverges at the 
phase transition. 



2. Spin glasses: linear susceptibilities 
For a spin glass in absence of external field and for a symmetric distribution of J, P(Jij) = P(—Jij), one has 



(SiSj) = for i 7^ j. In fact, for i ^ j, 



(SiSj) = Yw a mfm<? . (107) 



Due to the symmetry of the couplings, the probability (over the states a and the disorder) that the magnetizations 
mf and m" have the same or opposite signs are the same, and the average vanishes. The term (Si) (Sj) vanishes 
since (Si) and (Sj) have random signs. Then the total magnetic susceptibility is given by 



x=a\i-j i yAs,)''\=P{i -<,">). (io8) 



This is the susceptibility that one would measure if the system is prepared at equilibrium, then a small magnetic 
field is applied and the new equilibrium state is reached. A small magnetic field shuffles the free energy of the states, 
therefore in general the new equilibrium states in presence of a field will be very different from the old ones. As a 
result, in order to observe the susceptibility (108) one has to wait a very large time, the one needed for the system to 
reach the new equilibrium states. 

One might then be interested also in considering the linear susceptibility of a single equilibrium state a. This is the 
susceptibility one would observe if the system does not have time to escape from the original state after the magnetic 
field is applied. Using the fluctuation-dissipation inside the state a we get, under the assumption q aa = qEA, 

Xa=P^-^J2(Si)lj , XL* = $>aXa = £(1 " <teA) ■ (109) 

If P(q) is not trivial, i.e. it is not a delta function, q^ < qsA and the two susceptibilities arc different, in particular 
Xlr < X- This is very reasonable: once we switch on a small magnetic field B, the system will first respond by 
acquiring a small magnetization inside the state a, mr,R ~ xlrB; then it will escape from state a and find a better 
state in presence of B, which will then give a larger magnetization m ~ x& > m LR- This is an important property of 
the spin glass phase. 

Note that the time tjv needed to change state diverges with A^ if there is a true phase transition. Then, we can 
define (at finite N) the dynamical magnetic susceptibility 

^^^f-^^^*) ' (110) 



In a short range system, the discussion is very similar; clustering holds only for \i — j\ — > oo, but if the connected correlation is finite in 
this limit, the sum is dominated by the large values of \i — j\ and the result is the same as in (105). 
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where it is assumed that the system is in equilibrium in absence of field and the field is switched on at t = 0. For 
l<i< t n , the correlation function (Si(t)Si(0)) decorrelates inside the state a to which the initial configuration 
begins, (Si(t)Si(0)) ~ J2 a w a (Si) a (Si) a . Hence we have 

X (l « t « T N ) = P - 1 E W ° Wc^j = XLR ■ (HI) 

On the other hand, for t » tn, the system decorrelates completely in the full Gibbs measure: (Si(t)Si(0)) <~ (Si) (Si), 
and x(t ^ t n) — X, the equilibrium susceptibility. Recalling that tn diverges with N, we get 

Jim lim x(t) = X , 

iV->oo t-S-oo (112) 

lim lim x(*) = Xlr , 

t^oo N— 5-co 

The behavior of x(£) on a time scale t that diverges with N is more complicated; its discussion is behind the scope of 
these notes. 

In any case, for spin glasses the magnetic susceptibilities are finite at the transition: for a continuous transition, 
both x and Xlr show a cusp at T c , when qEA becomes non-zero. For a discontinuous lRSB transition, Xlr jumps at 
Td where states appear. However, from (97) with q = we have = (1 — m)qi = (1 — m)qEA', then, \ is analytic 
across Td because m = 1 and = 0, and has a cusp at when ra/1. 



3. 27te siatac spin glass susceptibility 



The diverging susceptibility is more complicated in the spin glass case. In order to keep the discussion more general, 
we will avoid from now on to perform explicitly the average over the disorder and consider a single (large enough) 
sample, using the fact the susceptibilities are self-averaging quantities. In this way the following discussion can be 
extended straightforwardly to glassy systems without quenched disorder. 

The analog of the magnetization for a spin glass system is the self-overlap qEA — <Zaa, that becomes non-zero when 
states appear. As we discussed in section II B 3, to compute qEA we should put a coupling between replicas in order 
to force them to be in the same state. We consider two replicas and choose a coupling SH = —^J2i^l^i- Then (101) 
becomes 

i 

It is worth note at this point that the quantity above is really the self-overlap of the states and not q^ (which is the 
thermodynamic average of q a [j). The reason is the following. Suppose we are in a phase where the P(q) is not trivial 
(for instance, we consider the SK model at T < T c or the p-spin model for T < T K ). In this case, the Gibbs measure 
is dominated by the set of states that have the lowest intensive free energy. The fluctuations in the weight are due to 
the 1/N corrections to the free energy We can write f a = f min + Af a /N, therefore 

It can be shown [9] that these weights have finite fluctuations, and although the number of states with finite weights 
is infinite for N — >• oo, the number of states that are needed to cover a fraction 1 — e of the total weight is finite for 
any e. The partition function of two replicas with the coupling discussed above can be written as 

Ze = e -2PN fmin ^2 e -P(Af a+ A Me 0eN qa , (u5) 

a/3 

Clearly, since the number of relevant terms in the sum is finite, and the weights are finite, and qEA = qaa > qa^p, 
the coupling term for any e > makes the contribution of the terms with a — f3 exponentially bigger than the one of 
Therefore the replicas are in the same state and the average overlap is given by qEA- On the contrary, q^ is 
related to a full average in the Gibbs measure in absence of any coupling 10 . 



By a similar argument one can show that P(q) = S(q — qEA) f° r an y e > 
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The susceptibility associated to qEA is therefore 11 (at finite N and e — > 0, when the replicas are uncoupled): 



XSG 



de ~ N 



(116) 



Performing the decomposition in pure states and using the clustering property it is easy to show that 



XSG = flN 



afi 



w a W/3(q a p) 2 - ^2 WaWpw^wsqapq-fS 

a/37<5 



pN[q^-(q^ 



h 2 l 



(117) 



Then xsg is divergent in the spin glass phase where the P(q) is not trivial and q^ 0. This happens for T < T c in 
the SK model (and in all models with a second order spin glass transition), and for T < Tk in the spherical p-spin 
model (and in all models with a discontinuous transition). In all these cases the partition function is dominated by 
the states of lowest free energy, so that an infinitesimal coupling is enough to force two replicas to be in the same 
state:. 



4- The dynamic spin glass susceptibility 



The static spin glass susceptibility defined above does not diverge between T d and Tk in the discontinuous case, as 
it should, because it is a thermodynamic quantity and we showed that the thermodynamics has no singularity at T^. 
How can we obtain a susceptibility that diverges at the clustering transition? For this, we would like to probe each 
state separately and not the full Gibbs measure. 

Similarly to what we did in Eq. (110), the solution is to consider a dynamical susceptibility. We define it as follows: 



Xs G (t) 4E[( S <(^(W) S i( fl )) - (Si(t)Si(0)) (S,m(0))} . 



N 



(118) 



Note that xsg(0) = 0. Again, the time needed to change state, tm, diverges with N. For t » tm the system 
decorrelatcs in the full Gibbs measure and 



XsG (t » TN ) = | Yl [ < S * 5 i> <W - W to> }=XSG, 



N 



(119) 



so that it reduces to the static spin glass susceptibility, and it is finite between T K and T d . 

In the region 1 <C t <C tjv, the system is only able to decorrelate within one state, and the dynamic susceptibility is 



X sg{1 « t « r N ) = | ]T [ ]T w a (SiSj) a (SiSj) a -Y,w a (Si) a (Si) a £ wp (Si) p (Sj) p ] 



N 



pN[J2w a q 2 aa -J2w a w q 2 a p] = pN[{q EA ) 2 - q (2) ] = XSG,LR ■ 



(120) 



Now, xsg,lr is divergent also in the discontinuous lRSB case between Td and Tk, where qEA ^ and q^ = 0. 
In summary, we have xsg(0) = and 

Jim lim Xsc(t) = Xsg , 
lim lim xsc(t) = Xlr.sg ■ 

t->oo AT-s-oo 



(121) 



Note that in the cluster phase Tk < T < Td, Xsc{t) has a very peculiar behavior 12 : it grows up to a very large 
(<~ N) value for t ~ tm, and then decays back to a finite value for really large times t ^> tjv- On the contrary, in the 
thermodynamic spin glass phase xsc{t) is of order N for all times t » 1. 



The factor /3 is sometimes omitted or replaced by /3 2 in the literature. 

This has been recently observed in structural glasses; usually XSgW ' s called X4(t) in the literature. 



32 



E. Exercises 



1. Dynamics of the fully connected Ising model - Write a program to simulate the Metropolis dynamics of the 
fully connected Ising model, Eq. (12). The dynamics is denned as follows. At each step try to flip a random spin, 
Si — > —Si. Draw a random number x uniformly in [0, 1] and accept the move if x < exp[— f3(H[S'} — H[S])}, 
where S' is the configuration with Si flipped. (Hint: in addition to the spins Si, keep in memory the total 
magnetization M — J^. Si updating it at each step. Use it to compute the energy change.) Using the program, 
simulate the time evolution of a small number of spins in the low temperature phase. Check that the system 
jumps from one state to the other and try to observe the scaling of the persistence time with N. 

2. The Random Energy Model - The p-spin Hamiltonian (47) for Ising spins is, for a given configuration S, a 
Gaussian random variable with zero average. 



• Show that the covariance H[S]H[S'] = ±NQ(S, S') p , where Q(S, S') = -fa £. S.S', is the overlap of S, S'. 

• Deduce that for p — > oo the energy of a configuration is a Gaussian random variable with variance N/2 
and that the energy of different configurations are uncorrelated (this is trivial). 

Therefore in the large p limit the p-spin converges to the Random Energy Model (REM): there are 2 N configu- 
ration with energies Ei that are Gaussian and independent, with zero average and variance N/2. 

• Compute the average number of states that have energy E, call it £l(E); compute the complexity E(e) = 
jj logfl(iVe) (the result is S(e) = log 2 — e 2 ). Show that the complexity vanishes for e < e^; compute ex- 

• The previous result implies that for e > en the number of states is very large; conversely, for e < &k 
the probability of find a state is exponentially small (S < 0) and we can assume that there are no states. 
Define then the partition function as 



2« 

Z ( T ) =Ve~ Bl/T ~ / CK dEQ(E)e- E / T . 

J-N\e K \ 



(122) 



Compute the free energy f(T) via a saddle point. Show that there is a phase transition at a certain 
temperature Tk, in particular that the second derivative of f(T) (the specific heat) has a jump. 

The REM has exactly the same phenomenology of the p-spin, except for the fact that in the REM each con- 
figuration is a state. Therefore T d does not exist since states are stable at all temperatures, and moreover the 
internal entropy S(T) is identically zero. See [9, 60] for details. 

3. Zero-temperature complexity of the p-spin model - Consider the zero-temperature limit of (76). Show 
that for (3 — > oo it has a finite limit at fixed fi = fJm and with q = 1 — aT, with a of order 1, given by 

^irsb = - -(pa + + log-?— . (123) 
4 2fi a + 

Deduce that a satisfies the equation a 2 + net — 2/p = 0, and compute its positive solution (recall that q < 1). 
Now show that 

e(^) = a lim f(m,T) = B^^rsb] , 

/3->oo,/3m=(U (124) 

and compute their explicit expressions as function of fi. Ch eck t hat the physical region corresponds, for p = 3, 
to fj, € [0.577,0.884], that the threshold energy is e t h = — -y/4/3 ~ —1.155 and that the ground state energy is 
e ~ —1.172. Show that in general e t h = -y2(P — 

4. Complexity of the 3-spin model - Starting from Eq. (76), and using (60), compute £(/) for the spherical 
3-spin at different temperatures, e.g. T = 0.63 > T d , T d > T = 0.6 > T K , T = 0.5 < T K (it can be done by 
Mathematica 13 or writing a program in C/Fortran/...). There are two possibilities to solve the equation for q: 
i) use that it is cubic and write the explicit solution; ii) write it as q = f(q) and solve it by iteration. 



A Mathematica sheet that does the job can be downloaded from http://www.lpt.ens.fr/-zamponi, section "Teaching" 
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III. DILUTED MODELS AND OPTIMIZATION PROBLEMS 



A. 



Definitions 



1. Statistical mechanics formulation of optimization problems 



The theory of computational complexity [38] establishes a classification of constraint satisfaction problems (CSP) 
according to their difficulty in the worst case. For concreteness it is convenient to introduce three problems we shall 
use as running examples in the following: 

• XORSAT. Find a vector x of boolean variables satisfying the linear equations Ax = b (mod 2), where each 
row of the 0/1 matrix A contains exactly k non-null elements, and b is a given boolean vector. 

• q-coloring (g-COL). Given a graph, assign one of q colors to each of its vertices, without giving the same color 
to the two extremities of an edge. 

• ^-satisfiability (fc-SAT). Find a solution of a boolean formula made of the conjunction (logical AND) of clauses, 
each made of the disjunction (logical OR) of k literals (a variable or its logical negation). 

Each of these problems admits several variants, for instance 

• Decision: one has to assert the existence or not of a solution, for instance a proper coloring of a given graph. 

• Sampling: one has to sample the solution according to a given distribution (e.g. uniform over all the solutions), 
and estimate for instance the total number of solution. 

• Optimization: one has to discover optimal configurations; these are solutions, if present, or otherwise con- 
figurations minimizing the number of violated constraints: for instance colorings minimizing the number of 
monochromatic edges. 

The decision variant of the three examples stated above fall into two distinct complexity classes: fc-XORSAT is in 
the P class, while the two others are NP-complete for k,q > 3. This means that the existence of a solution of the 
XORSAT problem can be decided in a time growing polynomially with the number of variables, for any instance 
of the problem; one can indeed use the Gaussian elimination algorithm. On the contrary no fast algorithm able of 
solving every coloring or satisfiability problem is known, and the existence of such a polynomial time algorithm is 
considered as highly improbable. 

A general statistical mechanics formulation of a CSP is the following. One is given N variabls Cj, i = 1, ■ ■ ■ ,N, 
that might be Ising spins (Boolean variables) or Potts spins, real numbers, or more complicated variables. Then, one 
is given a set of M constraints labeled by a = 1,-- • , M; each constraint involves a certain set of variables that is 
denoted by a a — (c^ , • • • , o~ i K ). The constraint is a function ^} a {pa) that is when a a does not satisfy the constraint 
and 1 otherwise. In the following we will use also the words "clause" or "test" to denote a constraint. 

The CSP consists in finding a configuration a verifying the condition Y\ a i>a{^a) = 1; the number of solutions is 



The optimization version of the same problem is obtained by replacing the hard constraint ip a with a "soft" one, 
ipP = if the constraint is not satisfied and ip@ = 1 otherwise. This corresponds to a problem at finite temperature 
j3 such that the energy of a violated constraint is equal to 1. The optimization problem is that of finding the ground 
state, while the decision problem amount to decide whether a ground state with zero energy exists. In both cases we 
are interested in working at zero (or very small) temperature. 

The three illustrative examples presented above admits a simple representation in this formalism: 

• fc-XORSAT. The degrees of freedom of this CSP are boolean variables that we shall represent, following the 
physics conventions, by Ising spins, S G {— 1,+1}. Each constraint involves a subset of k variables, S a = 
(Sji , . . . , Sjk), and reads ip a (S a ) = ■ • • £»* = Ja), where here and in the following I(-) denotes the indicator 
function of an event and J a G {— 1,+1} is a given constant. This is equivalent to the definition given in the 
introduction: defining x i} b a G {0,1} such that Si = (— l) x< and J a = (— l) ba , the constraint imposed by ip a 
reads x { \ + • • • + x { k = b a (mod 2), which is nothing but the a'th row of the matrix equation Ax = b. The 
addition modulo 2 of Boolean variables can also be read as the binary exclusive OR operation, hence the name 
XORSAT used for this problem. 




(125) 



a 
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FIG. 6: An example of factor graph. The neighborhoods are for instance di = {a, b, c, d} and di\a — {b, c, d} 

• q-COL. Here a G {1, . . . ,q} is the set of allowed colors on the N vertices of a graph. Each edge a connecting 
the vertices i and j prevents them from being of the same color: \p a {ai 1 <jj) = ^ dj). 

• fc-SAT. As in the XORSAT problem one deals with Ising represented boolean variables, but in each clause the 
XOR operation between variables is replaced by an OR between literals (i.e. a variable or its negation). In 
other words a constraint a is unsatisfied only when all literals evaluate to false, or in Ising terms when all 
spins S a = , . . . , Sik) involved in the constraint take their wrong values, that we denote J a = (Jji , . . . , J,*): 

MSa) = I(5 a V J a ) = 1 - l(S a = J a ). 

In all these cases we can define a Hamiltonian H = X^ff=i E a {<j a ), such that E a = if the constraint is satisfied and 
E a = 1 otherwise. Then, ip& = exp(-~f3E a ) and ip a corresponds to f3 — > oo. For instance, in the case of the g-COL, 
the Hamiltonian corresponds to a Potts antiferromagnet H = Y^Uj) S{cri,aj). In the following we will often drop the 
explicit dependence on (3 to lighten the notation. 

Factor graphs [61] provide an useful representation of a CSP. These graphs (see Fig. 6 for an example) have two 
kind of nodes. Variable nodes (filled circles on the figure) are associated to the degrees of freedom <jj, while constraint 
nodes (empty squares) represent the clauses ip a . An edge between constraint a and variable i is drawn whenever tp a 
depends on <7j. The neighborhood da of a constraint node is the set of variable nodes (ii, ■ ■ ■ 7 iK a ) that appear in 
a. Conversely we will denote by di the set of all constraints (a ll a 2l ■ ■ .) in which variable i is involved. We shall 
conventionally use the indices . . . for the variable nodes, a, b, . . . for the constraints, and denote \ the subtraction 
from a set. The graph distance between two variable nodes i and j is the number of constraint nodes encountered on 
a shortest path linking i and j (formally infinite if the two variables are not in the same connected component of the 
graph). 

Note that if the constraints only involve two variables i and j (as for instance in the SK model, in 2-XORSAT or 
in (j-COL), then each constraint is equivalent to a link connecting i and j and the factor graph reduces to a standard 
graph. 

2. Random optimization problems: random graphs and hypergraphs 

The common notion of computational complexity, being based on worst-case considerations, could overlook the 
possibility that "most" of the instances of an NP problem are in fact easy and that the difficult cases are very rare. 
Random ensembles of problems have thus been introduced in order to give a quantitative content to this notion of 
typical instances; a property of a problem will be considered as typical if its probability (with respect to the random 
choice of the instance) goes to one in the limit of large problem sizes. Of course the choice of a distribution over 
the instances is arbitrary and could not reflect the properties of the instances that relevant for a given practical 
application. Still, the introduction of a distribution over the instances allows to formulate the problem in terms of 
the statistical mechanics of a spin-glass-like model. We will see that this formulation provides important insight in 
the properties of difficult instances of these problems. 

An instance of a random CSP is defined by two objects: the underlying factor graph and, as in fully connected 
models, the set of couplings J appearing in the constraints (e.g. the right hand side of an equation in XORSAT). 
Both the factor graph and the couplings can be taken as random variables to define a probability distribution over 
instances. Recall that we have N variable nodes and M constraint nodes. In the statistical mechanics approach 
we will be interested in the thermodynamic limit of large instances where N and M both diverge with a fixed ratio 
a = M/N. 

Among many possible ensembles of graphs, two have been investigated in great detail: 



35 



Temperature 




c d C K c s Connectivity 



FIG. 7: Sketch of the phase diagram in the coloring problem at finite temperature (from [63]). At Td, the system falls out of 
equilibrium ("dynamic" transition). At Tk the system undergoes a "static" glass transition, eas represents the ground state 
energy, which is nonzero above the connectivity c s . 

• Random regular graphs (or fixed connectivity) [62]: each constraint involves k distinct variables (k is a free 
parameter for (XOR)SAT and k = 2 for COL), and each variable enters in exactly c different constraints. 
Uniform probability is given to all graphs satisfying this property. Note that one must have Mk = Nc, i.e. 
c = kM /N = ka. 

• Erdos-Renyi random graphs [61]: For each of the M clauses a a fc(> 2)-uplet of distinct variable indices 
(£*,..., i*) is chosen uniformly at random among the ( , ) possible ones. For large AT, M the degree of a variable 
node of the factor graph converges to a Poisson law of average ak. To compare with regular graphs we shall use 
the notation c = ka for the average connectivity. 

In principle one might allow also the connectivity of the constraints to be a random variable but we do not discuss 
such case here. Note that the limit c — > oo with a proper scaling of the couplings gives back the fully connected 
model. In this limit, the fluctuations of c in Erdos-Renyi graphs can be neglected and the two ensembles of graph are 
equivalent. 

Random (hyper)graphs have many interesting properties in this limit [61]. In particular, picking at random one 
variable node i and isolating the subgraph induced by the variable nodes at a graph distance smaller than a given 
constant L yields, with a probability going to one in the thermodynamic limit, a (random) tree. This tree can be 
described by a Galton- Watson branching process: the root i belongs to I constraints, where I is a Poisson random 
variable of parameter ak (or I — ak in the fixed connectivity case). The variable nodes adjacent to i give themselves 
birth to new constraints, in numbers which are independently Poisson distributed with the same parameter. This 
reproduction process is iterated on L generations, until the variable nodes at graph distance L from the initial root i 
have been generated. 

The algorithm to construct Erdos-Renyi graphs is trivial, since it is given by the definition. Fixed connectivity 
graphs can be constructed as follows: first one attach to each variable node a number c of links, thus obtaining Nc 
links. These links have to be connected to the Mk = Nc links attached to constraint nodes. To do this, one simply 
numbers the links from 1 to Nc and then pick up a random permutation of these numbers in order to decide which 
of the variable links has to be attached to a given constraint link. The resulting graph however might have variables 
that are connected twice or more to the same node. In this case the permutation is discarded and a new one is picked 
until a good graph is reached. In practice the probability of such event is small if N is large and c not too large, so 
that the procedure converges quickly to a good graph. 

3. Connectivity-temperature phase diagram 

In the previous sections we discussed the appearance of an exponential number of states as a function of the 
temperature, therefore focusing on the free energy / of the states. In the case of a random CSP, the control parameters 
are the temperature T and the connectivity of the underlying graph c (or equivalently the ratio of constraints per 
variable a = c/k), and we are mostly interested in the T — limit. 
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We anticipate that the typical phase diagram of a random CSP looks like the one in figure 7. The fully connected 
case corresponds to the limit of large c. In this case, as a function of temperature, we have shown that there is a 
"clustering" transition at Td and a spin glass transition at Tk < Tj. Both Td and Tk depend on c and they vanish 
at some given values of c, Cd and ck respectively. Therefore we expect that for connectivities below c<j, there is a 
unique cluster of ground states of zero energy (solutions), while for c> Cd these ground states are arranged in many 
clusters, each cluster being characterized by the number of solutions A/in belonging to it. We call "internal entropy" 
of a cluster the logarithm s = log A/i„, and again we call complexity £ the logarithm of the number of clusters. 

In the case of the p-spin model we used the modified partition function (58) in order to compute the complexity. 
We now comment briefly on the way this general formalism is applied to constraint satisfaction problems (CSP). 
First we split the free energy of a state in f — e — Ts, and we introduce a complexity as a function of e and s, 
S(e, s) — jf logA/"(e, s). Then we rewrite (58) as 

Z m = J deds e N^(e,s)-Pm(e-Ts)] (126) 

It would have been useful to introduce two parameters, one conjugated to energy and the other to entropy; however, 
in general the partition function above, that contains only m, is easier to compute (e.g. using replicas as we already 
discussed). Therefore, since we have only one parameter m, we cannot reconstruct the full E(e, s) via a double 
Legendre transform. The point we want to discuss now is that depending on how we take the T — > limit, we can 
obtain information on the entropy or the energy of the states. 

For a satisfiable instance of a CSP, the law defined in (125) is the uniform distribution over the solutions of the 
CSP, and the partition function counts the number of such solutions. If we take the limit (3 — > oo at fixed m of (126), 
the term exp[— N(3me] restricts the integral to e — and we get 

Z m = J dse ms(s)+ms] ; (127) 

where S s (s) = S(e = 0, s). We can define a "free entropy" 

S(m) = -J- \ogZ m = max[£ s (s) + ms] , (128) 
N s 

from which we can compute S s (s) by a Legendre transform. This "entropic" method allows to obtain information on 
the distribution of the entropies of the zero energy states, when they exist. This approach is however ill-defined for 
unsatisfiable instances. In this case Eq. (126) gives 

S(m) = max[S(s, e) + m(s - (3e)} . (129) 

If one takes now the limit (3 — > oo in the region where e > 0, the complexity term becomes subdominant and we do 
not get much information. 

In order to obtain a meaning result, one has to take the limit m — > and (3 — > oo simultaneously, in such a way 
that the product (3m, usually denoted y, remains finite. One therefore obtains 

S e (y) — max[S e (e) — ye] , E e (e) = max S(s, e) . (130) 

e s 

This "energetic" cavity approach allows to obtain the distribution S e (e) of the energies of the states (irrespective of 
their entropy) and allows in particular to compute the ground state energy of the problem [64, 65] . 

We expect (and it can be verified by the explicit solution of these models) that the static value m s (T, c), that results 
from optimization of the free energy, will be 1 in the "liquid" (or "paramagnetic") high temperature phase. Then 
m 8 (T = 0, c) = 1 on the T = line below Cd- On increasing c above Cd, m s (T = 0,c) will become smaller than one 
and decrease, still remaining finite. Only at c = c s , m s (T — 0,c s ) = 0, signaling the transition to the unsatisfiable 
phase. Above c s , m s (T, c) vanishes linearly in T for T — > 0, defining the corresponding y. For this reason, the entropic 
cavity method is mostly appropriate below c s , while the energetic cavity method is mostly appropriate above c s . 

Both methods can be used to detect c s . Indeed, c s is the point where solutions at e = disappear, hence 
max s E s (s) = max s E(e = 0, s) vanishes at c s . In the entropic method, S s (s) is computed directly and its maximum 
corresponds to m — 0. In the energetic method, in the satisfiable region, we can take a second limit y — > oo 
(after (3 — > oo at fixed y — (3m) to concentrate on the states with e = 0. The complexity computed in this limit 
is S e (0) = max s S(s,e = 0), so that it gives back the maximum of the entropic complexity. In other words the 
procedure y — > oo after (3 — > oo is equivalent to perform the entropic computation with a Parisi parameter m = 0, 
i.e. to weight all the pure states in the same way, irrespectively of their sizes, which is the correct way to determine 
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the satisfiability threshold c s . The determination of the clustering transition is more subtle. A calculation of Cd 
using the energetic method was first performed in [64, 65] ; this corresponds to the appearance of a solution of the 
1RSB equations with m — 0. Later it has been shown that the calculation of Cd at m = 1 can be performed using 
the entropic cavity method [66]; this is the equilibrium clustering threshold, which can be related to a dynamical 
transition at zero temperature as a function of c. 



B. XORSAT: clustering and SAT/UNSAT transition 

Before discussing the cavity method, we would like to analyze in some detail the XORSAT problem. We will focus 
on the XORSAT problem defined on Erdds-Renyi random graphs, and with couplings J a = ±1 (equivalently b a = 0, 1) 
with probability 1/2. We will refer to this distribution as "random XORSAT" in this section. Similarly to the fully 
connected p-spin model, random XORSAT can be fully analyzed at zero temperature and the phase diagram can be 
understood in detail. These results have been originally derived in [67, 68]. A very nice review on phase transition in 
optimization problems, from which this section is reprinted, is [37]. 



1. Bounds from the first and second moments methods 

Let Af be a random variable taking values on the positive integers (it will be the number of solution of an instance 
drawn from the assigned probability distribution), and call p/s its probability. We denote by (Af) and {Af 2 } the first 
and second moments of Af (assumed to be finite) , and write 

Psat = p(Af > 1) = P« = 1 ~Po (131) 

W=l,2,3,... 

the probability that Af is not equal to zero. Our aim is to show the inequalities 

{ ^<p(Af>l)<(Af) . (132) 
The right inequality, call 'first moment inequality', is straightforward: 

(Af) =Y J MPN= Y,*fp*> Yl PX=pW> !)■ (133) 

A/" SS>1 7V>1 

Consider now the linear space made of vectors v = (vo, Vi, V2, ■ ■ ■} whose components are labelled by positive integers, 
with the scalar product 

v-V = Y,P* v * v Af ■ ( 134 ) 

A/" 

Choose now vm = Af, and v' = 0, v'j^ = 1 for Af > 1. Then 

w = (Af 2 ) , v • v' = (Af) , v' • v' = P {Af > 1) . (135) 

The left inequality in (132) is simply the Cauchy-Schwarz inequality for v,v': (v • v') 2 < (v • v) x (v' • v'). If 
Af represents the number of solution of a given XORSAT instance, we can use these bounds to obtain bounds on 
Psat — p(Af > 1). 

Figure 8 shows the probability that a random 3- XORSAT formula is satisfiable as a function of a for increasing 
sizes N. It appears that formulas with ratio a < a c ~ 0.918 are very likely to be satisfiable in the large N limit, while 
formulas with ratios beyond this critical value are almost surely unsatisfiable. Use of the first and second moment 
inequalities (132) for the number Af of solutions provides us with upper and lower bounds to the Sat/Unsat ratio a s . 

To calculate the first moment of Af remark that an equation is satisfied by one half of the configurations. When we 
average over the possible choices of the second member of the equation, we have that two equations are satisfied simul- 
taneously by 1/4 of the configurations, and M equations are satisfied simultaneously by 1/2 M of the configurations. 
The average number of solutions is thus 2 N /2 M , from which we get 

P S AT<Af=2 N(1 - a K (136) 
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FIG. 8: Probability that a random 3-XORSAT formula is satisfiable as a function of the ratio a of equations per variable, and 
for various sizes N. The dotted line locates the threshold a s ~ 0.918. 



The first moment vanishes for ratios larger than unity, showing that 

a c < 1 . (137) 

This upper bound is definitely larger than the true threshold from the numerical findings of Figure 8. Close to a = 1 
formulas are unsatishablc with probability one (when N — > oo), yet the average number of solutions is exponentially 
large! The reason is that the average result is spoiled by rare, satisfiable formulas with many solutions. 

As for the lower bound we need to calculate the second moment M 2 of M. We use here the representation in terms 
of bits and we denote by X — {x\, • • • , xjv} the configuration of the N bits. For a given instance, N = J2x = 
Tx iIa-"-a(^0' wnere IPO is the indicator function of the event that X is a solution to that instance, and I a (X) is 
the indicator of X being a solution to clause a. As equations are independently drawn 



= j2w(Y) = y.huxw) = ]rn = Ep(*> y ) M ( 138 ) 

X,Y X,Y a X,Y a X.Y 

where the sum is carried out over the pairs X, Y of configurations of the N variables, and p(X, Y) is the probability 
that both X and Y satisfies the same randomly drawn equation. The latter can be easily expressed in terms of the 
Hamming distance (per variable) d between X and Y, defined as the fraction of variables having opposite values in 
X and Y. The general expression for fc-XORSAT is 14 

P{d) = \(l + {l-2df) . (139) 

Going back to (138) we can sum over Y at fixed X, that is, over the distances d taking multiple values of with the 
appropriate binomial multiplicity, and then sum over X with the result 

<^ 2 > = 2N E ( Nd) Pid)M = ° MN m a)) (140) 

in the large N limit, where 

A(d, a) = log 2 - dlog d - (1 - d) log(l - d) + a Inp(d) . (141) 



The equation is satisfied by X with probability 1/2. Given that X is a solution, it is satisfied also by Y if the number of variables 
entering the equation and taking opposite values in Y as in X is even. By definition of d the probability (over its index i) that a variable 
takes different value in X and Y is d. To construct the equation one has to extract k independent indexes i, then the probability that the 
extracted variables are equal in X and Y is (fyd°(l — d) k ~° . Similarly the probability to have two different variables is (^)d? (1 — d) k ~ 2 , 
and so on. The sum of all even numbers gives expression (139) for p(d). Sec [37] for details. 
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The absolute maximum of the function A(d, a) is located in d* = \ when a < a 2 — 0.889, and d* < \ when a > a 2 . 

In the latter case N 2 is exponentially larger than AT 2 , and the second moment inequality (132) docs not give any 

information about Psat- In the former case M 2 and A" 2 are equivalent to exponential- in- N order. It is shown in [37], 
by computing the saddle point corrections to (140), that their ratio actually tends to one as N — > oo. We conclude 
that formulas with ratios of equations per variable less than a 2 are satisfiable with high probability in the infinite size 
limit, or, equivalently, 

a c > a 2 ~ 0.889 . (142) 

Unfortunately the lower and upper bounds do not match and the precise value of the threshold remains unknown at 
this stage. We explain in the next section how a simple preprocessing of the formula, before the application of the 
first and second moment inequalities, can close the gap, and shed light on the structure of the space of solutions. 

2. The leaf-removal algorithm 

We now sketch how clustering in the random 3-XORSAT problem may be analyzed rigorously. The techniques used 
are borrowed from the analysis of algorithms, and probability theory. 

Let S be a randomly drawn 3-XORSAT system. A crucial remark is that removal of an equation containing a 
single-occurrence variable preserves the satisfiability, or unsatisfiability of S. Consider for instance x\ + x 2 + £3 = b. 
Clearly if x\ appear only in this equation, for any value of x 2 and x$ the equation can be satisfied by choosing the 
appropriate value of x\. Therefore if the system of the remaining solution is satisfiable, the same will be for the full 
system. 

This procedure can be iterated to further simplify S until no single-occurrence variable is left. The output of the 
procedure is S', the largest subsystem of S where variables appear at least twice 15 . In the following, we analyze: 

• the procedure which allows us to extract system S' from S; 

• the statistical properties of S' , and their consequences in terms of satisfiability for <S; 

• how the solutions of the original system S can be reconstructed from the solutions of S', and how clustering 
emerges from this reconstruction process. 

Intuition on single-occurrence variable removal is made easier once we introduce the graphical representation of 
Boolean systems in terms of factor graphs discussed above. 

Removal of a single-occurrence variable and of its attached equation from the system S is equivalent to removal 
of a leaf variable and its attached constraint from the factor graph. Removal of a leaf may "uncover" vertices and 
produce new leaves. The process may therefore be iterated well after all leaves present in the original hypergraph 
have been removed. How can we quantitatively track the reduction of the hypergraph as removal goes on? Let us 
call step of the procedure the action of choosing a leaf and removing it together with its constraint, and f-vertex a 
vertex of connectivity I, i.e. which appears in £ distinct constraints (with I > 0). Removed leafs will be considered 
as 0- vertices in order to conserve the total number of vertices. The number of ^-vertices after T steps is a stochastic 
variable, depending upon the system S and the sequence of leaves removed by the procedure, denoted by Ne(T). 
Obviously, for all T, 

Y,Ni(T) = N and ^ I N e {T) = 3 (M - T) , (143) 

e>o e>o 

as a result of the conservation of the total number of vertices and constraints respectively. Removal goes on as long as 
N\(T) > 1. Denote N(T), and call population the set {Ne(T),£ > 0} of all numbers of ^-vertices. Knowledge of the 
population is generally not sufficient to unambiguously determine the hypergraph produced by T steps of the removal 
procedure. Indeed, many hypergraphs have the same population N(T). An essential observation is, however, that the 
output of T steps of the removal procedure is equally distributed among the set of all hypergraphs having population 
N(T). In other words, a complete statistical information about hypergraphs is obtained from the knowledge of the 
population. 



From this definition, it is clear that 5' is unique and independent of the order in which equations with single-occurrence variable arc 
removed. 
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Let us now see how this population IM(T) is modified during step T — > T+ 1. The variations of the are stochastic 
variables due to the randomness in the system S and the choice of the leaf to be removed, with conditional expectations 
with respect to N(T) given by 

E[7V £ (T + 1) - N t {T)\ti(T)] = -5 n + Seo + 2 W+1 (T) - 2 Pl (T) . (144) 

When a constraint is removed, a 1-vcrtcx disappears (— Sn term in (144)) to become a 0- vertex (Seo). This constraint 
is attached to two other vertices. The numbers £, £' of occurrences of each of these two vertices are distributed with 
probability 

and are diminished by one once the constraint is taken away 16 . From (144), the expectation values of Ng vary by a 
quantity of the order of unity at each time step, and the expected densities of ^-vertices, defined through 

nt{T) = -LJUl , (146) 

undergo changes of the order of 1/N only. We are naturally led to conclude that densities are function of a much 
longer "time scale", t — T/N. In other words, given t € [0,1], ng([tN]) and ne([tN] + o(N)) are equal to within 
o(l) terms as TV tends to infinity, and we denote by ne(t) their common limit. Assuming that the densities ne(t) are 
diffcrcntiable functions of this time, we obtain from evolution equation (144) a set of coupled first order differential 
equations 17 , 

^(t) = -S n + Seo + 3(q 2 _ t) [{£ + 1) n e+ i(t) - *n<(t)] . (147) 

Resolution of these equations require to assign the initial conditions. At time t = 0, that is, prior to any removal of 
constraint, densities are Poisson distributed with parameter 3 a, 

n e (0) = e- 3a{ ^f- ,V£, (148) 

as we discussed above. 

Solutions of equations (147) with initial conditions (148) read 



no(t) - e- 3ab ^ 2 +3ab(t) 2 (l - b(t)) , 
m{t) = 3ab(t) 2 (e~ 3ab ^ 2 +6(t)-l) , 



si 



(3ab(t) 

where 

b(t) = ( 1 - f ) . (150) 



n e (t) = e- 3 ^ 2 £^LL , V £> 2 , (149) 



^1/3 



a 

These equations are valid as long as n\ is positive, since the procedure stops when no leaf is left. The density ni(t) 
of 1-vertices is showed on Fig. 9 for various initial ratios a of equations per variable. The shape of the curve reflects 
the competition between two opposite effects: the annihilation of 1-vertices due to leaf removal and their creation as 
a result of reduction of 2-vertices to 1-vertices. At small ratios e.g. a = 0.7, the former mechanism dominates and n\ 
is a decreasing function of time. For slightly larger ratios, the density m does not monotonously decrease with time 
any longer, but still vanishes at time t* = a i.e. when no constraint is left. 



The probability pi of picking a variable, say, x\ , is equal to the number t\ of its occurrences divided by the total number of occurrences 
of variables in the system, 3(M — T). The probability of picking any variable with occurrence I is pi multiplied by the number Ng(T) 
of such variables, hence (145). Equation (143) ensures that pi is properly normalized. 

The left hand side of (144) reads N[n e (t + 1/N) - n e (t)] = dn t /dt + 0(1/N) through a Taylor expansion. 
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FIG. 9: (From [37]) Evolution of the density of 1-vertices, ni(t), under the operation of the leaf removal procedure as a function 
of the reduced time t/a. For a < ad — 0.818, n\(t) remains positive until all the constraints are eliminated at t* = a. For 
a > a d the procedure stops at the time t* for which m vanishes (black dots), and becomes negative for t > t* (dashed part of 
the curves). Notice that t* discontinuously jumps down at a — a d . In the inset, a' defined in Eq. (154) is plotted as a function 
of Q. 



According to the solution (149) of the equation of motion, this statement holds for ratios a for which the equation 

b=l-e- 3ab2 , (151) 
has no strictly positive root. When a is larger than 

In(l-ft)" 



ad = mm 

0<6<1 



3 6 2 



~ 0.818469... , (152) 



there exist non zero real solutions to (151), and we denote by b* the largest one. As shown in Fig. 9, the density 
of 1-vertices decreases and vanishes at time t* < a where b(t*), given in Eq. (150), reaches b*. The output of the 
procedure is a non-empty subset S' of S with M' = N (a — t*) equations, such that any of the 

N' = N J2n e (t*) (153) 

e>2 

variables present in S' appears at least twice. The ratio of equations per variable in S' is given by 

,_M'_ ab* 2 
a N' l-3ab*(l-b*) ' [ ' 

and is shown in the inset of Fig. 9. 

Let us briefly sum up the outcome of the above analysis. There exists a critical ratio ad (152) for the original 
system S such that 

• if a < ad, the leaf removal procedure succeeds in eliminating all variables and equations, thus S' is empty; 

• if a > ad, the output of the procedure is a non-empty subset S' , with ratio of equation per variables equal to 
a' (154). 

The reader could feel concerned about fluctuations around those average results. Equation of motion (147) is indeed 
true for the average density of vertices only. Fortunately, one can show that, for large sizes N, the numbers of 
i- vertices are highly concentrated around their average values, 



N e (T) = N ni (Jpj + o(N) , 



(155) 



whatever the initial instance S and the sequence of choices done by the removal procedure. Therefore, the evolution 
of the population cannot deviate from the average behaviour when N — >• oo [37]. 
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3. Clustering and sat-unsat transitions. 



As before, we consider a system S with ratio a and apply the leaf removal procedure. The output is the subsystem 
S 1 . Can we reconstruct the solutions of S from the ones of 5'? The answer is positive, and the reconstruction process 
permits us to characterize the structure of the set of solutions in an accurate way. 

Assume that we have, in the course of the removal procedure, stored all removed equations on top of each other in a 
stack. Let X' be a solution of S' . The length of X', that is, the number of variables it includes is N' defined in (153). 
Through a relabelling of variables we can always assume that X' specifies the values of the first N' variables Xi. We 
are now going to reinsert in S' the equations that were removed from S following a Last-In-First-Out unstacking 
order. 

Consider the first reinserted i.e. on top of stack equation, 

Xi + Xj + x k = v , (156) 

where v = 0, 1 is the second member, and k) a triplet of distinct integers comprised between 1 and N. Call v x 
the number of these integers strictly larger than N' . That this equation was eliminated by the leaf removal procedure 
and was the last one to be so tells us that v\ > 1: at least one of the three variables in (156) is not assigned by the 
particular solution X'. If v is precisely equal to unity, we have, say, i 7 j < N' and k > N' + 1. Then the values of 
known for X', and there is a unique way to assign x k to satisfy (156). If z^i — 2, we know, say, Xi from X' , 
and have two possible combinations of Xj, x k fulfilling (156). Therefore we can construct two distinct solutions of the 
system S[ = S'+ reinserted equation. The reasoning is straightforwardly extended to the last vi — 3 case, with the 
general result that 2" 1-1 distinct solutions can be reconstructed from X' . Then, after reinsertion of the second-to-top 
equation, we reconstruct 2 1/2_1 solutions from any of the solutions of S[ where vi is the number of new variables i.e. 
present in the reinserted equation and not fixed by S[ . Iterating this procedures up to the reinsertion of all M — M' 
equations permits us to obtain 

M-M' 

Kec= ]J ( 15? ) 
T=l 

solutions of S' M _ M , = S'+ all equations. Notice that solutions of S' M _ M , do not quite coincide with the the ones 
of the original system S. Variables that were absent from equations in S (i.e. that had zero connectivity) are still 
undetermined from S' M _ M ,, and can be freely chosen, giving an extra multiplicative factor to the total number of 
solutions of S that can be reconstructed from X', 

M-M' 

Af in = 2 N °M rec = 2 N ° [] 2 VT ~ l , (158) 

T=l 

where N a is the number of 0-occurrence variables. This set of solutions and X' are respectively called cluster of 
solutions and seed of the cluster. Taking the logarithm of (157) and dividing by N, we obtain the following expression 
for the entropy of the reconstructed cluster from seed X', 

The average value of this entropy can be calculated from the knowledge of the average values of the utS. As equations 
were carefully introduced in opposite order to their removal, the expectation value of vt — 1 (conditioned to population 
N(T)) can be estimated from the analysis of the previous section and is equal to 2pi(M' + 1 — T) with pi given in 
(145), and we end up with 



log 2 



e -3a 



where the first term comes from the contribution n (0) (148) of absent variables, and t* is the time at which the leaf 
removal procedure halts. 

For small ratios a, S' is empty, and so is X' . The halt time t* = a and integration of (160) with the help of (149) 
leads to the simple result s(a) = (1 — a) log 2. As we have reconstructed all possible solutions of S from the empty 
X', s coincides with the total entropy s to t of solutions, that in this case coincides with log 77. At such small ratios 
the fluctuations of Af are not important. 
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On the contrary, for a > the leaf removal procedure stops at t* < a, and has not succeeded in eliminating all 
equations and variables. The average entropy associated to a cluster reconstructed from one seed is 

^L = l-a-b*-ab* 2 (2b*-3) , (161) 

where b* is, as before, the largest positive root of (151). 

To complete our description of clusters, some statistical knowledge about their seeds is required. The number N 1 
of solutions of S' can be analyzed by means of the first and second moments method (132), giving respectively some 
upper and lower bound to the probability Prob[7V' > 1] of existence of solutions. The first moment is easy to calculate: 
as the second members of the various equations are uncorrelated, we find 

W=2 N 'x(±) =2 N '( 1 - a ') . (162) 

The overbar denotes here the unbiased expectation value over all instances with N' variables, M' equations such 
that any variable appears at least twice. The second moment calculation is made more difficult by the existence of 
contraints on the minimal number (two) of occurrences of variables in S' . It requires a combinatorial analysis of 
the number of systems S' i.e. ways of choosing equations, having a given pair of configurations for solutions. This 
calculation is beyond the scope of this notes and was done by Dubois and Mandlcr [69]. The outcome is that, at 
large N, the second moment {N') 2 is asymptotically equal to the first moment squared, {N') , when a' < 1 and 
exponentially larger when a' > 1. This result has two important consequences. 

First, from (132), we conclude that a' = 1 is at the same time an upper and a lower bound to the exact value of 
the threshold for S' . Therefore a' = 1 is the location of the sat-unsat threhold for reduced subsystems. Using (154), 
we see that a' = 1 is reached for an initial ratio equal to 

a s ~ 0.917936... , (163) 

as shown in Fig. 9. Remarkably, while the moments method directly applied to S gave lower and upper bounds to 
the threshold separated by a finite gap only, the outcome is much better when applied to <S'. The concentration of Af' 
constrasts with the (instance-to-instance) fluctuations exhibited by Af, and suggests that the latter thus essentially 
comes from fluctuations in the numbers No and iVi of 0- and 1-vertices eliminated by the leaf removal algorithm. 
This comes as no surprise since variations of Nq and N\ induce drastic changes on the number of solutions e.g. the 
presence of a 0-vertex multiply the number of GS by two. Conversely, in S', variables appear at least twice and are 
more interconnected, giving rise to weaker fluctuations for Af'. 

Secondly, when a' < 1, the number of solutions of S' is, with high probability, given by (162) 18 . This is the number 
of seeds each of which gives a cluster of solutions for S. The number of clusters of solutions is therefore 



Af clu = 2 N '^-<*')=e N * , (164) 



where the entropy of clusters is defined as 



^ = — (1 -a') = b* -3ab* 2 + 2ab* 3 . (165) 
log 2 N 

From the reconstruction process it is clear that two different seeds cannot give twice the same solutions. Therefore 
the total entropy of solutions of S is 

s tot (a)=E(a)+s(a) = (l-a)Iog2 , (166) 

as obtained when summing (165) and (161). This is exactly the analytic continuation of the entropy of the unclustered 
phase. We have now established that s tot = (1 — a) log 2 for all ratios a < a s . The entropies of solutions in a cluster, 
s, and of clusters, S, are shown in Fig. 10. 

What we are left with is merely justifying the cluster denomination used above. The reconstruction process allows 
a complete characterization of solutions, in terms of an extensive number of (possibly overlapping) blocks made of few 



18 This statement comes, again, from the fact that the second moment is equal to the squared first moment, and therefore fluctuations of 
Af' around the average value are extremely rare. 
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FIG. 10: (From [68]: entropies are in units of log 2) Ground state structure and entropies as a function of the ratio a of equations 
per spin. The total entropy (logarithm of the number of unfrustrated ground states per spin, or solutions) is s to t = (1 — a) log 2 
for a < a s ~ 0.918. For a < a d ~ 0.818, solutions are uniformly scattered on the JV-dimensional hypercube, with a typical 
normalized Hamming distance d = 1/2. At ad, the solution space discontinuously breaks into disjoint clusters: the Hamming 
distance di ~ 0.14 between solutions inside a cluster is much smaller than the typical distance do = 1/2 between two clusters 
(RSB transition). The entropy of clusters, s, and of solutions in each cluster, S, are such that S + s — s to t- At a s , the number 
of clusters ceases to be exponentially large (E = 0). Above a B , ground states are frustrated, i.e. there are no solutions. 

variables, each block being allowed to flip as a whole from a solution to another. When a < ad, with high probability, 
two randomly picked solutions differ over a fraction d = 1/2 of variables, but are connected through a sequence of 
O(N) successive solutions differing over 0(1) variables only. For ay < a < a s , flippable blocks are juxtaposed to 
a set of seed-dependent frozen variables. For picture 10 to be true, we must establish that different clusters do not 
overlap or, more precisely, that any two solutions belonging to two different clusters are far away from each other. A 
lower bound to the Hamming distance d between these two solutions is the Hamming distance d s between the seeds of 
their respective clusters. A lower bound d s , as a function of a, can in turn be estimated using again the first moment 

method i.e. from the vanishing condition of the expectation number N' 2 {d s ) of solutions of S' lying apart at distance 
< d s . An explicit calculation shows that d s > for all a > ad [67, 68]. 



The reconstruction procedure helps us to interpret equation (151) as a self-consistent equation for the size of the 
backbone of a cluster of solutions. Let X' be a solution of S' , and B the set of variables taking the same value in all 
solutions of the cluster associated to seed X'. The cardinality of B divided by N is the backbone b. Obviously, the 
variables assigned by X' are elements of B, thus 



showing that the backbone is strictly positive. So far no proof of the equality between b and b* has been obtained 
in literature, though this could, in principle, be inferred from the analysis of the removal and reconstruction proce- 
dures described above. The following is a hand-waving argument supporting the interpretation of b* as the cluster 
backbone [70]. 

We want to estimate the probability that a variable, say, x\ does not belong to the backbone i.e. that x\ takes 
value on some solutions and 1 on others. x\ appears in I equations, where I is a Poisson variable with parameter 
3 a. Consider one of these equations, 



4- On backbones 




(167) 



xi + xj +x k = v 



(168) 
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If either Xj or Xk does not belong to B, neither does x±. Thus the probability that the above equation does not 
constrain X\ to take the same value on all solutions is 1 — b 2 . This expression assumes that the events 'Xj G B ' and 
'Xk G B' are independent, a literally wrong assumption which, however, apparently becomes asymptotically true for 
large system sizes (this is the basis of the cavity method that we will discuss in the next section). Similarly, under 
the assumption that the above events are independent from equation to equation, we obtain 

Prob[a;i G B\£] = 1 - (1 - b 2 ) e . (169) 

Summing over £, we have 

Probfci eB] = J2 ^ " [1 - (1 - ^ 2 ) £ ] = 1 - e" 3 " 62 • (170) 

e=o 

As the l.h.s. of the above equation is precisely the size of the backbone, b, we see that b fulfills (151) and is equal to 
b* as announced. 



5. Summary 

Unfortunately, the method discussed in this section can be applied only to the XORSAT problem. In fact, the 
clusters in this problem have a very simple structure: they are built around a seed (solution of the reduced system 
S 1 ) by adding back the leaves. This has a number of peculiar consequences: 

1. In all clusters there is a finite fraction of "frozen" variables (backbone): these are the variables in the seed that 
are constrained to take the same value in all the solutions belonging to the cluster; and some of the leaves, those 
that belong to equations where both other variables belong to the seed (see the discussion in the last section). 

2. All the clusters have the same internal entropy: this is because adding back the leaves, the fraction of "free" 
variables concentrates in the termodynamic limit to the same number for all clusters. Moreover, all these 
variables (those that are not in the backbone) are completely free to be ±1 with probability 1/2. 

3. The SAT/UNSAT transition coincides with the point where £ vanishes: this is because when S = the reduced 
system becomes UNSAT. 

As we will see the structure of other optimization problems is more complicated: the internal entropy may fluctuate 
from cluster to cluster, and a backbone is not always present. In these cases we need a different method, that we 
explain in the next section. Before turning to that, it is useful to reproduce the results obtained above by mean of 
the replica method [70] (=> Ex.III.l). 

C. The replica symmetric cavity method 

The cavity method, initially invented to deal with the Sherrington Kirkpatrick model of spin glasses [9] , is a powerful 
method to compute the properties of ground states in many condensed matter and optimization problems. It is in 
principle equivalent to the replica method, but it turns out to have a much clearer and more direct interpretation, 
that allows in practice to find solutions to some problems which remain rather difficult to understand in the replica 
formalism: the replica approach is very elegant and compact, but it is more difficult to get an intuitive feeling of 
what is going on. Also, the cavity approach deals with usual probabilistic objects, and can lend itself to rigorous 
studies [15]. We shall present it here at two successive levels of approximation. The first one, corresponding in 
replica language to the replica symmetric (RS) solution, is an easy one and has already been studied a lot. The one 
corresponding to one step replica symmetry breaking (Irsb) is more involved and has been fully understood only 
very recently [43, 44, 66, 71-73]. 

1. Recursions on a finite tree 

The key properties of random graphs that is exploited by the cavity method is that loops are very large in the 
thermodynamic limit, as we discussed in the introduction to this section. Therefore, locally random graphs look like 
trees. Before studying the cavity method, we must understand how to solve statistical mechanics models defined on 
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FIG. 11: Example of an Ising tree model on 7 vertices. The definition of Z2->i and its recursive computation reads here: 

Z 2 -).l (<72) = ^23(02, Cr 3 )lp24(a2, CT4)tp4 5 (a4, a 5 )V'46(o"4, a G )lp47((T4, CT 7 ) = J2 Z 3 ^2{ff3)Z4^2(cr4)lp23{ff2,Cr3)lp24{ff2,Cr4). 

£T3,...,CT7 ^3i°'4 



trees. This can be done by a generalization of the transfer matrix methods that are used to solve models in one 
dimension (which is indeed the special case of a regular tree with connectivity c = 2). We will for the moment restrict 
to consider models where the clauses have connectivity k — 2: for these models, one does not need a factor graph 
representation, since interactions can be represented by a standard graph with vertices i = 1 , • • • , N representing 
variables Oi and links representing interactions %j)ij{ui,aj). We shall repeatedly use in the following the same 
notation we used for factor graphs, specialized to the k = 2 case: therefore we use di for the set of vertices adjacent 
to a given vertex i, i.e. for the sites which interact with i, and di \ j for those vertices around i distinct from j. 

Let us then consider the case where the interaction graph is a finite tree. In this case the computation can be 
organized in a very simple way, taking benefit of the natural recursive structure of a tree. We define the quantity 
Zi_).j((Ti), for two adjacent sites i and j, as the partial partition function for the subtree rooted at i, excluding the 
branch directed towards j, with a fixed value of the spin variable on the site i. We also introduce Zi{ai), the partition 
function of the whole tree with a fixed value of Uj. These quantities can be computed according to the following 
recursion rules, see Fig. 11 for an example, 



Z i ^ j {a i )= Yl wl Z < 

kedi\j \ ffc 



Z i{°i) = II I ^2 Z j^i( a j)^ij( a i' a j) ] • ( 171 ) 



It will be useful for the following discussion to rewrite these equations in terms of normalized quantities which 
can be interpreted as probability laws for the random variable Uj, namely rji^j(ai) — Z^j(ai)/ '^2 , Zi^j(a') and 
Vi( a i) = Zi{ a i) / J2<t> Zi{<7 r ). The quantity r?,->.j {pi) is the marginal probability law of variable <7j in a modified system 
where the link (i, j) has been removed. The recursion equations read in these notations 

rn^j{<Ji) = — !— [j (5>*-w(*fcM*fa,*fc)J , rn{ai) = — Y[ ( ^Vj^iivjWijivi^j) I , (172) 
Zi ^ j kedi\j U t / Zl jedi \ *, J 



where z^j and Z{ are normalization constants: 



= 5Z II ( ^2vk^t{cr k )iptk(crt,a k ) j , z t = ^ J| ^ 7] j -,i(a j )ip ij (a i , aj) \ , (173) 

Ci kedi\j V o-fc / <Ti jedi \ &j J 



Since the leaves are isolated when the link connecting them is removed, one has Z^j (cr,) =const. and r\i^j { a i) =const. 
for leaves. However, one can also choose to put an arbitrary rji^j{(Ji) on the leaves: this might represent an external 
field acting on them, or the effect of a given boundary condition. Moreover the quantity r/i(c7j) is exactly the 
marginal probability law of the Gibbs-Boltzmann distribution, hence the local magnetizations can be computed as 

mi = (at) = E ( r 7 ?i( cr ) cr - 

We can now write the free energy of the system. For this it is useful to define the object 



(174) 
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FIG. 12: Pictorial representation ol Eq. (180). The bubbles on the first panel represent subtrees of the graph; their effect on 
the spins <ri, . . . er c _i is summarized by ??cav, represented as a bold arrow in the second panel. Tracing over these c — 1 spins 
leads to the third panel. 



where the last two equalities are easily derived using Eqs. (172). Then, the total partition function is 

z=z i nz j ^ i n iif n ^■■■=w kI ir (175) 

jedi kedj\i jedi Zi: > kedj\i Zjk "W)^ 



and the free energy is 



F = -T\ogZ = J2h-J2^ > 

fi = -T log Zi , 
fij = -T log Zij . 



(176) 



On a given tree, the equations (172) for the rji^j for all directed edges of the graph have a single solution, which is 
easily found by propagating the recursion from the leaves of the graph. From this solution, one can compute the free 
energy using Eq. (176). 



2. From a tree to a random graph 

The reasoning of section III C 1 was made under the assumption that the interaction graph was a tree, and we 
arrived on the recursion (172). Suppose now that we are on a random graph, and we cut a link then we produce 

two "cavity" variables on sites i and j, whose connectivity has been decreased by one. As before, we define the 
quantity t]i-^j((Ji) as the marginal probability law of variable cr, in a modified system where the link has been 
removed. We can write an exact equation: 

Vi->j(<ri) = — ^— V(di\j)-n(Wk},k e di\j) JJ ip ik ((Ti, <r k ) , (177) 

Z% ^ 3 {cr k },kedi\j kedi\j 

where we introduced a multi-spin cavity field ??(ai\j)->i({cfc} J k G di\j) that describes the joint distribution of the 
spins in di \ j in absence of the links that connect them to i. The problem obviously is that in these way the equations 
are not closed. We can close the equations if we assume that the cavity variables are uncorrelated and write 

V(di\j)-n(Wk},k e di\j) = Yl Vk^i((7k) , (178) 

k£di\j 

which gives back Eqs. (172). Now, Eqs. (172) have to be interpreted as a set of coupled equations for the unknown 
rji^j . Contrary to the tree, on a graph with loops they cannot be solved by recursion. Still the number of equations 
is clearly equal to the number of unknown and we can hope for a unique solution. 
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The replica symmetric cavity method corresponds to the use of the local recursion equations (172) derived under the 
assumption (178) to compute the free energy of models defined on sparse random graphs, that are locally tree-like. 

How can we justify assumption (178)? The key observation is that, as we already discussed, random graphs converge 
locally to trees in the thermodynamic limit. Loops have typically length ~ log(iV). Therefore, if we cut the links 
between variables in di\j and variable i, in the modified system variables in di\j arc very far away. If there 
is a single pure state, then correlations in the Gibbs measure decay quickly with distance and Eq. (178) becomes 
asymptotically correct for N — > oo. In summary, the assumption of the existence of a single pure state implies that 
the effect of the loops does not spoil the existence of a unique solution to the local recursions (172). Their presence 
simply provides self-consistent boundary conditions. We conclude that Eqs. (172) provide an exact description of a 
disordered model on a random graph for N — > oo, in a phase where there is a single pure state. 

Note that one can look for a solution of the recursion equations (172) on any graph, even in the presence of short 
loops. Although this procedure is not exact in general, it might provide a good approximation to the true solution of 
the problem. This approach is known as Belief Propagation in inference problems [74] , and corresponds to the Bethe 
approximation of statistical mechanics [75] . 



To become more familiar with the cavity method, we now consider the simplest possible case. We specialize to a 
model such that 

1. The underlying graph is a random regular graph: the connectivity of the constraints is k = 2, so we do not need 
a factor graph representation, and the connectivity of the variables is fixed to c. We have therefore N variables 
and M = Nc/2 links (constraints). 

2. There is no disorder in the Hamiltonian, i.e. the constraints are all equal to a deterministic function of the 
involved variables. 

Simple example of models in this class are 

1. The Ising ferromagnet/antiferromagnet: H = —J2(ij) JSiSj; 

2. The Potts antiferromagnet H — X^(jj) &{°~ii ~j)i that corresponds to g-COL. 

3. A model of hard spheres, defined by rij € {0, 1} (occupation numbers) and local constraints that impose that if 
a site is occupied (m = 1) at least some of the neighboring sites cannot be occupied [76]. 

Therefore this class is already rich enough to show some interesting features and structures: actually, g-COL already 
contains the richest Irsb structure and further complications do not add much to the physical picture [71]. 

The main simplification in these models is that all sites are statistically equivalent, i.e. the local environment is 
tree-like (with probability 1 for N — > oo) without fluctuations of the connectivity or of the local interactions. The 
latter have the form ipij(o~i,o~j) — tp(<Ji,<jj) for a fixed function ip. Notice that in these systems, the frustration and 
the disorder (if any) are due to the presence of loops, and thus occur only on large scales (~ log./V/log(c — 1)). 

Consider a site i and the region around this site. For N — > oo, this region will be a tree with probability 1 and we 
can use the recursion (172) to compute everything inside the tree. The problem is that in this case the "leaves" are 
connected to the rest of the graph, which provides a boundary condition that has to be determined. Suppose that we 
are in a phase that is not frustrated. Then, we expect the system to be homogeneous. Suppose then that each of the 
leaves feel the same external field due to the rest of the graph. We initiate then the recursion with the same 770 (a) 
for all the leaves. Then, at each iteration, the 77's remain identical and satisfy the recursion: 



3. Replica symmetric cavity equations in absence of local disorder 




'c-l 



Vg(°~i) ■ ■ ■Vg{°~c-i)ip(o-,o-i) ■ ■■ip(a,a c - 1 ) . 



(179) 
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FIG. 13: Illustration of Eq. (181); the local magnetization of one site is computed by taking into account all the c neighbors. 



Since the tree region around site i can be arbitrarily large, we have to iterate this recursion for a very large number 
of times, and it is natural to assume that we will converge to a fixed point 19 rj cav which satisfies the relation 



^cavM = — ^— ^ Wcxvivi) ■ • ■»7cav(CT c -l)V'(CT,CTi) • • ■ 1p((T, <7 c _l) = — !— T] cav (a')^((T, a') 



c-1 



(180) 



CTl ,...,<T C _1 



with z cav is the normalization constant. A pictorial representation of this equation can be found in Fig. 12. The local 
magnetization is then computed as 



(181) 



including the c neighbors of a central site as represented in Fig. 13. Although we fixed a particular site i at the 
beginning, this reasoning is clearly independent of the site in the thermodynamic limit, since in that limit all sites 
have the same environment around them. Therefore, we expect that in this limit all the fields rji^j (or equivalcntly 
T] g ) will converge to 7/ cav , solution of (180), and all the fields f]i((r) will converge to rj{a) defined in (181). 

We can now compute the free energy of the system. We start from Eq. (176) together with (173) and (174). We 
observe that, under the homogeneity assumption made above, all Zi — and Zij — z^ are equal with 



logz (z) = log (cri^cav^)^!;^) 



and 



logZ (s) =log ^2 %av(CTl) ' ' •^cav(ff c )V'(CT,CTi) • • -?A(ct,CT c ) = log^ I J^?7 cav (cr')^(cr,cr') j . 

(7,<Ti ,- - - ,<7 C <J \ (j' / 

Recall that for a random regular graph the number of links is Ac/2. Then we get 

f= — = _ -f(i) 
j 2 

/« =-Tlogz( s ) , 
/<'> = -Tlogz« . 



(182) 



(183) 



(184) 



In presence of spontaneous symmetry breaking, for instance for a ferromagnetic system at low temperature, there might be several fixed 
points. In ordered systems, one can usually select one of them by adding a suitable small external field to break explicitly the symmetry 
and obtain a system with a single pure state. 
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FIG. 14: An example, for the case c = 3, of a Gn,6 cavity graph where q — 6 randomly chosen cavity variables have c — 1 = 2 
neighbours only. All the other N — 6 spins outside the cavity are connected through a random graph such that every spin has 
c = 3 neighbours. 



The cavity method (here presented for an homogeneous state of a model without disorder on a random regular graph) 
consists in solving Eq.(180) and using then Eq. (184) to compute the free energy. Obviously, any other observable 
can be computed from the free energy by adding suitable external fields. 



4- An alternative derivation 

Before moving to more complicated cases, it is useful to present an alternative derivation of the cavity method, that 
also explain the historical origin of its name. This derivation was the original one of [43] and this section is reprinted 
from that paper. 

Let us introduce an intermediate object which is a model with TV variables, on a slightly different random lattice, 
where q randomly chosen "cavity" variables have only c — 1 neighbours, while the other N — q spins all have c 
neighbours (see fig. 14). We call such a graph a GN, q "cavity graph". The cavity variables are characterized by some 
joint probability distribution T) cav (ai, ...,a g ). 

While our primary interest is in Gn,o graphs, the intermediate construction of GN,q is helpful. The basic operations 
which one can perform on cavity graphs are the following: 

• Iteration: By adding a new spin cr of into the cavity, connecting it to c — 1 of the cavity spins, say ci, <7 c _i, 
one changes a GN,q into a QN+i,q-c+2 graph: 

AN = l, Aq = -c + 2. (185) 

• Link addition: By adding a new link between two randomly chosen cavity spins o\ , 02 , one changes a GN,q into 
a GN,q-2 graph: 

AN = 0, Aq = -2 . (186) 

• Site addition: By adding a new spin a into the cavity, connecting it to c of the cavity spins say a\,...,a c , one 
changes a GN, q into a GN+i,q-c graph: 

AN = 1, Aq = -c. (187) 

In particular, if one starts from a Gn,2c cavity graph and perform c link additions, one gets a Gnu graph, i.e. our 
original problem with N variables. Starting from the same Gn,2c cavity graph and performing 2 site additions, one 
gets a Gn+2,o graph, i.e. our original problem with N + 2 variables. Therefore the variation in the free energy when 
going from N to N + 2 sites (Fjy+2 — Fn) is related to the average free energy shifts AF^ for a site addition, and 
AF^ for a link addition, through: 



F N+2 — Fj\f — 2AF {s '> - cAF (z) 



(188) 
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Using the fact that the total free energy is asymptotically linear in N, the free energy is finally 

/ = lim F N /N = Fn+2 - Fn = AF« - -AFU . (189) 

TV— >oo 2 2 

An intuitive interpretation of this result (for even c) is that in order to go from N to N + 1 one should remove c/2 
links (the energy for removing a link is minus the energy for adding a link) and then add a site. 

Now we should derive the equation for n cav (a). When q/N <C 1, generically, the distance on the lattice between 
two generic cavity spins is large (it is of the order of the size of the loops, therefore diverges logarithmically in the 
large N limit). We will therefore assume that: 

1. Different cavity variables become uncorrelated. This is true only if the Gibbs measure is a pure state, and the 
clustering property, stating that distant variables are uncorrelated, holds. 

2. The states of the system before and after any of the previous graph operations (e.g. iteration) are related. 
Equivalcntly one should assume that the perturbation corresponding to the variation of one of the cavity spins 
remains localized and it does not propagate to the whole lattice. This is clearly a decorrelation hypothesis that 
is very similar to the previous one. 

Under these hypotheses, which are reasonable for a non frustrated homogeneous phase, it is very simple to compute 
the free energy shifts defined above. 

The important point is that our hypotheses imply that the joint probability of the cavity variables factorizes; we 
have 77cav(ci, • • • ,a q ) — «7cav( cr i) ' ' ' Vcavi^q)- To simplify the problem, let us assume that the cavity distributions 
are identical, if cav {o-) = r) cav (a), for all cavity variables because of our assumption that the local environment is the 
same everywhere in the graph (no fluctuations of connectivity and interactions). This is not an harmless assumption, 
because in general frustration might induce different biases on different variables; we will come back to this in next 
section. 

Consider now an iteration where a new cavity variable is added and connected to c — 1 cavity variables. The new 
cavity variable has a probability distribution 

?7cav(cr) = — !— ^ Tl ca , v {(Ji)----q ca , v {(j c ^i)tl){(j,ai)---tl){a,(J c ^i) , 

/ vc-i (190) 

Zcav = ( X! Vcav(o-')lp(o; 0-')\ 

that must again be identical to the old ones; this is because we assumed that the local environment does not fluctuate 
and that, for large N , the state of the system remains the same after any of the above operations. Therefore (190) is 
a self-consistent equation that can be solved to obtain i] cav (a). We got back Eq. (180). 
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Once this is done, we can compute the free energy shifts for a link and site addition. Consider a link addition 
between two cavity spins <j\ and o~ 2 . Before addition, we have 

Zbefore = y^TTV^Oq) = Z(<J 1 ,<T 2 ) , 

° ° aia2 r7l , (191) 

( \ 1 \ i \ Z(ai,a 2 ) 

77cav(Ol,0 2 ) = ?7cav(CTl)»7cav(0-2) = ~= , 

^before 

where Z (01,02) is the trace of the Gibbs measure over all variables except the two cavity ones. When we add a link 
(i.e. a constraint) connecting a\ and 02, we have 

Z a fter = ^ 1p(<?l , Vl) TT (°"«) = X! ^ (^1 ' ^2 ) Z{(7\ , (72 ) = Z&e/ore r] cav {(J 1 )-q cscv {(J2)lp{(Jl , 2 ) , (192) 

(7 (2 <7lCT2 cr l cr 2 

and we obtain the final result 

logz w = log ^ aiter = log ^(0-1)77^(^2)^(^1, o 2 ) , (193) 

^be/ore 

which coincides with (182). With a very similar argument one can show that 

logz (s) =log ^ %av(oi) • • •77cav(o c )7/;(oo,o-i) • • -V(o-o,o- c ) = logj^ I ^?7 cav (cr')V , (o-,cr') J • (194) 

o a ,<?l,--- ,cr c cr \ cr' / 

which gives back (182). Putting these results in (189) we get back Eq. (184). Finally, note that from T7 cav ((7) we 
can reconstruct the true marginal probability of the variables r)(a), defined as the trace of the Gibbs measure over all 
variables but a. To do this we consider a graph with c cavity spin ai, ■ ■ ■ ,a c and perform a site addiction to get the 
real graph; we get 

P{(?) = -j^y 77cav(oi)---?7cav(o c )-0(o,cri)----0(cr,cr c ) = -Ly ( ^ ^cav^'M ") O 7 ) J . (195) 

Cl,— > CT c V cr' / 

which is (181). This concludes our derivation; we have then presented two slightly different but equivalent derivations 
of the cavity method. 

A very important property of the free energy (184) is that it is variational? . one can easily show (=> Ex. III. 6) 
that the functional equation dr< — ^ = gives back Eq. (190). In particular, this allows for a simple computation of 

the energy e — and entropy s — because it is enough to compute the explicit derivative with respect to T 

without deriving with respect to J7 cav (cr). The result is 21 : 

s = logz (s) - c/21ogz (() +f3e , 
e = e {s) -c/2e {l) , 

e W = T^jy y~] 77 cav (CTi)77 cav (CT2Moi,o- 2 )£(o-i,o- 2 ) , (196) 



cr\(7i 



e {s) = -^y Y ? 7cav(o'i)---?7c a v(o-c)V'(o-o,o'i)---V(o-o,o-c)[^(o-o,o-i)H h E((T , a c )] . 

crc>,cri,--- .cr c 

The equations above can be applied to many problems on random graphs, such as the Ising ferromagnet (=> Ex. III. 2) 
and q-COL (=> Ex.III.3). 

In the case of un-frustrated ferromagnetic models it has been shown rigorously [77] that the assumption we made 
on the way are correct, the predictions of the cavity method being exact in the thermodynamic limit, both for the 
local magnetizations and for the free-energy per site. 



20 However, it cannot be proven that the extremum is a minimum in general. 

21 Recall that we defined ip(cri,<T2) = exp[— f3E(ai, CT2)]. 
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5. Fluctuations of the local environment: distributions of cavity probabilities 



We will now allow for the presence of local fluctuations, in the form of quenched random couplings, and fluctuations 
of the variable connectivity. We show here how to take the average over this quenched disorder. We will still restrict 
to k = 2 for simplicity. In this more general setting, the cavity variables are not equivalent: they do not have the 
same distribution, because of the explicit disorder present in the Hamiltonian, or in the graph, or both. 

In principle we could take a given realization of the disorder (graph and couplings) for finite (but large) N, and 
try to find a solution of (172). This can be done in some cases numerically, but it is not very practical. If we 
are interested in computing the free energy averaged over the disorder, we can introduce a distribution of cavity 
distributions, V[rj C av(^)], that gives the probability (over the links i — > j) that variable Uj has a cavity probability 
distribution: 

V[Vca.v{(r)] = Prob [r/j^. j (o-j) = n cw {(Ji)] . (197) 

If we take the average over the disorder (both the couplings and the random graph) , then this distribution must be 
the same for all sites, since all sites are statistically equivalent. 

The equation for V is deduced by using Eq. (172) and imposing that the probability distribution of rji^j is the 
same as that of the f]k->-i ■ Defining 

»7cavM = 7 7cav( cr l)---»7ca; 1 ( cr c-l)V'l(ffO,Cri)---^c-l(cro,Cr c _l) = .Fj frcav > " " " .tv] > (198) 

(71,-" ,<T C -1 

where the interactions ipi ■ ■ ■ ip c -i may contain one realization of the disorder, one obtains 

■P[v° cav } = J dVlvU ■ ■ ■ dViv^ 1 } % c ° av - J^j[ V l av , ■■■ , ^av 1 ]] , (199) 

where the overline denotes an average over the couplings J and over the distribution of c. This equation has the 
following interpretation: one must extract the rf c&v from V, and produce a new n® &v that must also be typical of V ■ 
The function Tj may depend on the couplings that appear in the constraints; these have to be extracted from their 
distribution at each iteration. 

Moreover c might also be a random variable, and an important remark on the distribution of c is in order. Suppose 
that P(c) is the distribution of the connectivity of the variables of the random graph. We should not use P(c) to take 
the average in Eq. (199). The reason is the following. The cavity distributions rji^j arc defined on the links of the 
graph. When we perform a cavity iteration using Eq. (172), we fix an oriented link i — > j and consider all the other 
links k — > i that are connected to site i. Since V[rj cav {a)\ is the histogram over the links of cavity distributions, to 
derive Eq. (199) we must take a random link, not a random site. The average in Eq. (199) is taken over the probability 
P(c) that, picking uniformly at random a link i — > j, the variable i has connectivity c. It is easy to show that 

ho . sQ£» , 

oo C (200) 
c = ^cP(c) , 

c=0 

and the average over c in Eq. (199) must be taken with the distribution P(c). 

To compute the free energy one has to average (176) over the distribution V that solves the previous equation. It 
is easy to see that the result is 



f = W {V ' C ' J} -^ iV ' J} , (201) 



2 



where 



/ (s){P ' C " /} = dV[riU ■ ■ ■ dVivU log 



E - 1 

.(7,(Tl ,<7 c 



/ ''cav (CTl ) ■ ■ ■ Vca,v (Cc)V>l (<T, CTi ) • • • 1p c (a, 0~ c ) 



■{c,J} 



(202) 



X " VLv (<7l )?7c 2 av MV>(CTl , (J 2 ) 



f w{V "' } = -tJ dvlnUdvl^iog 

We note that here the site term must be averaged using P(c), since it is an average over all sites, see Eq. (176). 
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6. The zero temperature limit 



As we already discussed there are two ways of taking the zero-temperature limit. The first is to assume that the 
system is in the SAT phase and therefore the ground state has zero energy. In this case a solution to the constraints 
exist and we can just take infinitely hard contraints (ft — > oo). The energy is zero as can be seen from (196), and the 
entropy is simply log Z and is given by the first of (196) without the last term (3e. 

On the other hand we might be interested in a situation where not all the constraints can be satisfied at the same 
time and the ground state energy is non-zero. In this case the limit j3 — > oo of Eq. (190) does not make sense since 
the normalization constant might vanish when a contradiction is met. 

In this case one has to take care. For simplicity, let us focus on a spin glass Hamiltonian H = — Y]uj\ JijSiSj for 
Ising spins [44]. The denominator in (198) is given by z cav <~ e ~^ e ^ v at leading order for large (3. We introduce a 
cavity field h^j by the parametrization, again at leading order for (3 — >• oo: 

rn-+j(Si) ~ e^^H^I) . ( 2 03) 

The cavity equation becomes, substituting the previous parametrizations and taking the leading order for (3 — > oo: 

c-1 

h S - \h \ = e cav + + ■faS | - N) • (204) 



We can write 



Then we get 



»=i 



\hi + JojS'oI = a(hi, Joi) + S u(hi, J M ) , 

a{hi, J i) = -(\hi + J i\ + \hi - J i\) , (205) 
u{hi, J 0i ) = lj(\hi + J 0l \ - \hi - J 0i \) . 



c-1 



ha = ^u{hi,J Ql ) , 
i=l 

c-1 

e cav = -\ho\ + ^2[\hi\ - a(h i: J 0i )} . 



(206) 



i=l 



Similarly we get 



e« = M + \h 2 \ - max[/n5i + h 2 S 2 + J12S1S2} 

Sl,S2 

c c 
i=l i=l 



(207) 



If we assume that the Jij are drawn from a given distribution P(J), and we introduce the distribution of cavity 
fields V(h), we obtain [44] the zero-temperature limit of (199) 

r-{c,J} 



V(h) = J dhiVityS^h-J^ufaJoi)^ , (208) 
where the overline is an average over P(J) and, if one wishes, also on fluctuations of c. The ground state energy is 

—--{V,c,J} C-T^iV,.!} 

e = e( s ) ~2 ' ^ 209 ^ 

i.e. the site and link energies (207) have to be averaged over the distribution of J and c, and over V(h), as in Eq. (201). 

The solution to these equations for the case J = ±1 with probability 1/2 can be found in [44]. Many other problems 
have been studied using this zero-temperature "energetic" formalism, including q-COL [78] and fc-SAT [64, 65]. In 
the following we will focus more on the "entropic" limit in which one assumes to be in the SAT phase and studies the 
structure of the solutions. 
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7. On a factor graph 

Let us conclude by generalizing the previous equations to the case of a factor graph. We will only briefly sketch the 
derivation, more details can be found in [79]. 

In this case we have two type of nodes, variables and constraints. It is convenient to introduce cavity variables 
as before: they are variables connected only to c — 1 constraints. We also introduce cavity constraints: they are 
constraints that are connected only to k — 1 variables. 

To each cavity variable we associate its cavity distribution J7 cav (cr). To each cavity contraint, we associate a 
distribution r]t es %{a) that is defined as follows: imagine to add a fc-th missing variable a to the cavity constraint and 
only to it. Then ?7tost (c) is the distribution of this variable. 

Recall that the number of constraints is M — cN/k. We denote by Gn,m,p,q a graph with N variables and M 
constraints, of which p are cavity variables and q are cavity constraints. 

We have five possible operations: 

1. Variable iteration: We add a new cavity variable and connect it to c — 1 cavity constraints. 

2. Constraint iteration: We add a new cavity constraint and connect it to k — 1 cavity variables. 

3. Site addition: We add a new variable and connect it to c cavity constraints. 

4. Constraint addition: We add a new constraint and connect it to k cavity variables. 

5. Link addition: We add a new link connecting a cavity constraint and a cavity variable. 

We can start with a Gn,m,o,o graph and delete ck links. Now we get a graph GN,M,ck,ck as each link deletion 
produces a cavity variable and a cavity constraint. To the latter graph we add k new variables, that have to be 
connected to the ck cavity constraints, and c new constraints, that have to be connected to the ck cavity variables. 
We thus produce a graph with N + k variables and M + c constraint (note that the relation M = cN/ k still holds) , 
GN+k.M+cfifi- Then we get 

F(N + k)- F(N) = kf = kAF^ + cAF^ - ckAF^ . (210) 

The iteration equation are the following. When we add a new cavity variable cr, we connect it to c — 1 constraints. 
The influence of each constraint on the new variable is independent from the others and given by rj t cst(o~). Then we 
have 

1 c_1 

?7cav(>o) = J| ^cstOo) = ^"cav^test ' ' ' Vtcsl] > ( 211 ) 

cav i=0 

where in principle each constraint can produce a different distribution. 

When we add a cavity constraint ip a , we connect it to k — 1 cavity variables <ti, • • • , Ok-i- To compute the cavity 
distribution we need to add also a fictitious variable o that is connected only to the new constraint. Its distribution 
is then 

*7t°cstM = — r ?cav( (7 l)--- J 7cav 1 ( CT fe-l)V'a(CT0,<Ti,--- ,a k -l) = FteAvLv ' ' ' Vc^} • ( 212 ) 

Ztest 

Finally we need to compute the free energy shifts. When we add a new constraint, k cavity variables become 
connected and 

logzW = log J2 fcavfa) ' ' ' VcMMo-l, ■■-,**)■ (213) 

When we add a new variable, c constraints are connected to it and give independent influence, then 

c 

logz^log^n^tH . (214) 

When we add a link, we connect a cavity variable a with distribution 77 cav (cr) with a cavity constraint whose influence 
on cr is ?7 tcst (cr). Then 



logZ W = log^77 cav (cr)77 tcst (cr) 

cr 



(215) 
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It is easy to check that the equations above reduce to the ones for a normal graph for k = 2 (=> Ex. III. 4). 

In presence of site fluctuations all the considerations we made in the k = 2 case can be easily generalized. The 
same holds for the zero temperature limit. The generalization of the formalism to the factor graph allows to discuss 
for instance the case of XORSAT (=> Ex.III.5). 



Before turning to the lRSB equations, let's summarize this discussion. The replica symmetric cavity method works 
when the Gibbs measure is a single pure state. Under this assumption, cavity variables are uncorrelated for N — >• oo 
since loops are very long. The equations derived on a tree can then be used to describe a factor graph: 

• On a given system of finite (large) size N, one can use Eqs. (172) and (176). 

• For a homogeneous system (such that locally there is no disorder in the couplings and the graph), in the 
thermodynamic limit all the cavity fields are equal and the RS cavity equations reduce to Eqs. (180) and (184). 

• In presence of local disorder, one must introduce a distribution of cavity fields over the disorder; the cavity 
equation for this object is given by Eq. (199), and the free energy can be obtained from Eq. (201). 

Similar equations are obtained for systems defined on factor graphs, and in the zero temperature limit. 



In the one-step replica symmetry breaking scenario, we will assume that the Gibbs state is split in a large number 
of states, as we obtained for the spherical p-spin model. Each state has a weight w a oc cxp(—(3Nf a ) in the partition 
function. As discussed in section II B 3, in such a situation we wish to compute 



from which the thermodynamics of the system can be reconstructed. 

Recall that the central hypothesis of the RS cavity method is that (distant) cavity spins are uncorrelated and the 
Gibbs state is stable (for N — > oo) under the operations on the graph defined above. Both these properties are false in 
the presence of many states, because i) the Gibbs state is not a pure state and the decorrelation (clustering) property 
does not hold, and ii) the free energy of the states are shifted when operating on the graph, and this might change the 
relative weight of the states in the partition function, therefore changing the nature of the Gibbs state. The treatment 
of the free energy shift is quite complicated [43]. Therefore in the following we will not use the derivation of the lRSB 
cavity equations based on the graph operation of section III C 4. This derivation can be found in [43] . We will present 
instead a derivation based on the idea of first writing the recurrence equations on the tree and then justify its use on 
the random graph by a decorrelation assumption [80]. This derivation has also the advantage that it is formulated 
for a single graph and choice of the couplings, so one does not need to take the average over the disorder. 

In the RS treatment, the key property is the factorization of the joint distribution of the cavity variables, expressed 
by Eq. (178), that leads to the closed equations (172) for the cavity messages. Based on the discussion above, in 
presence of many pure states we can only assume that Eq. (178) is true for the messages f]f-^j restricted to one pure 
state. A precise formalization of this hypothesis is the following. By definition, we must be able to select one state 
by acting on each spin with infinitesimal field hf. If we take first the limit N — > oo and then the limit hf — > 0, we 
will end up in the state a. For N — > oo in presence of hf, we have a single pure state and we can use the RS cavity 
equations to obtain a set of messages Then we can take the limit hf, and the resulting messages will describe 

the state a in absence of the external field. Therefore the VtU-j are a nxe d point of the RS cavity equation (172). The 
free energy of a state is given by equation (176), calculated in the fixed point rff_+j. While finding the fixed points 
analytically is not possible, we could hope to determine them numerically. The problem is that often, in presence of 
many fixed points, an iterative solution of equation (172) is not possible because the recursion will not converge if 
one starts with random messages. Still, in many cases we do not need to know the full set of fixed points. We only 
want to count how many solutions have a given free energy /, to compute the complexity, and this can be done by 
mean of Eq. (216). 



8. Summary 



D. 1-step replica symmetry breaking 




a 



(216) 
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1. The auxiliary model 

Let's summarize the situation, and rewrite once again the relevant equations (172) and (176), in the case k = 2 
(simple graph) for simplicity. We should find all the solutions of the RS equations 



Vi->j{Vi) ~ 



Zi ^ j kedi\j 



n ( y^vk^ii^Aki^i^k) j = Ji-»j[??fe^i,fc g ^\ j] , (217) 



and make use of the Bethe free energy to obtain the free energy of each state: 



where 



-PF Bethe [{rn^j}] = J^logz, - lo E z ij . ( 218 ) 

<T« k£di\j \ <?k / 

<Ji jedi \ Cj J 

In the equations above, we also added a local term ipi((Ti), which may represent a local field ipi(<Ji) — e^ hi(Ji . One can 
easily understand how to place this factor by doing the calculation on the tree. The solution of the equations above 
gives the free energy of a generic statistical mechanics model having the partition function 

z = e n^fo) n tvMwj) ■ ( 22 °) 

{ffi} » (i,3) 

Note that on random graphs, the cavity messages vf-^-j pl & y the role of the local magnetizations in fully connected 
models: they fully specify a given state a of the system. We can write the partition function (216) as an integral 22 
over the messages r\i^j : 

Z m = e-P™^^ = f Vru^j e-f jmF ^^ J] 6 fa-i ~ ^ 6 ^ ~ ^ ( 222 ) 

(id) 

where on each link we have two messages, rji-^j and rjj->i, and the two delta functions enforce the RS cavity 



22 For continuous messages, it is not completely clear what is the correct integration measure Vrji^j in Eq. (222). In principle, the delta 
functions require the introduction of a determinant of the second derivatives since one should write 



3 "if Bethe k 



-^"ifBethehnj] 



/ 



dru^je 



mflWu-nl I HSl^j-^j] ) dct 



•Fi-tj) 



9(Vi 



dr/ k . 



(221) 



Here we don't discuss this issue in details and just neglect the determinant. A discussion, based on a discretization of the cavity 
distributions, can be found in [80, pag.436] (thanks to P.Urbani for pointing out this problem). 
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equations (217). Using (218), we get 

z m = f vm^j I] Z T II z v ms to-* - 6 to-* ^ 

= I vv^j n ( z ? n s fa-* ^ ) n % m 

(id) 

To simplify the notations we will often omit the arguments of the different functions. There are two interaction terms 
in the last equation above. The first term is a product over all the sites i of a term ^ that depends on all the messages 
involving site i. The second term is a product over all the links of a term that depends only on the messages 
living on that link. 

We can therefore interpret Eq. (223) as the partition function of a statistical mechanics model, where the variables 
are the messages. On each link of the original graph, we have a variable made by the two messages {rji-^j, with 
a local field z~j m ; on each site of the original graph, we have a many-body interaction z™ IIjea» ^ [Wi-tj ~ We 
can therefore re-interpret the original graph as a factor graph, where on each link there is a variable node, and the 
original variable nodes act as interaction nodes. 



2. RS cavity equations for the auxiliary model: the 1RSB equations 

We can write the RS cavity equations for the auxiliary model in a compact way by introducing the message 
Qi^jiVi^jiVj^i)' which is the probability distribution of the two messages sitting on when the connection 

between this link and node j is absent. 

A quite straightforward calculation, similar to the one of section III C 1, leads to 

Q i ^ j (r]i-, j ,r] j -,i) = ^v^i^iVj^i) ^ ^i({Vi^-i,Vi-n}iedi) ]J Qk^i{Vk^i,Vi->-k) (224) 

,_>J {Vi^k,Vk-n}keBi\j kedi\j 

Using the explicit expressions of "J^ and ^>ij we obtain 

ZijiVi^-jiVj-ti) m 



Zi({Vk^i}kedr) m ]J 6 ~~ II Qk^i(Vk^i,Vi^k) 

l ~* 3 {Vi^k,Vk^i}keOi\j kedi\j 



^ Zi{{rik^i\kedi) m b~ [Vi->j ~ ^i->-j[{Vk->-i}k€di\j]] Yl Qk^iiVk^-ijFi-tk) 



Z- ■ 

{Vk^i}k£di\j kedi\j 

~2. f z i->-j({llk->-i}kedi\j) m fi [Vi->3 ■- ^i^j[{Vk^i}kedi\j]] Yl Qk->i{Vk->-uFi->k) 

l ~* 3 {nk^i}kedi\j kedi\j 



(225) 



where in the second step we used the delta functions to integrate over the and in the third step we used the 

identity Zi/zij = see Eq. (219). This last simplification makes rjj-n disappear from the last line of the equation 

above. The only point where rjj-^i appears in the right hand side of the above equations is in the argument of T^k 
inside the function Q. Therefore, a consistent choice is to assume that Qi^j does not depend on r/j^i, or in other 
word Q does not depend on its second argument. Using this assumption, we finally obtain 

Qi^jiVi^j) = — !— ^ Zi^j({Vk^i}kedi\j) m S [vi^j - ^j[{Vk^i}kedi\j]] J| Qk^iiVk^i) ■ (226) 

l ~* 3 {Vk^i}kedi\j kedi\j 

Note that Qi^j is the probability distribution of r\i^,j over the states, with weight e ~ /3mFa , or in formula: 

Qi^iVi^j) = E e-^ mF —^Mv^j Vt+j] ■ (227) 
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Following the same steps as in section III C 1 (the details are slightly different because of the different graph structure 
of the auxiliary partition function), we find that the "replicated" free energy is 

-0N$(m, T) = log Z m = log Zi - log Zij , 

i ij 

Z l = Yl ^({%^}feeai) m II = , 

Zij = ^ z ij{Vi^jiVj^i)" l Qi^j{ r li^j)Qj^i( r lj^i) ~ ( z ij) a 
Vi-H>Vj-n 

Eq. (226) and Eq. (228) constitute the set of lRSB cavity equations for a given graph and choice of the couplings. 
The reader has probably already guessed that 2RSB equations could in principle be obtained by performing a lRSB 
calculation on the auxiliary model, and so on. Unfortunately the complexity of the calculation is already prohibitive 
at 2RSB so we will not explore further this possibility. 



(228) 



3. Homogeneous lRSB equations 

As we did in the RS case, we can now consider a system where there are no spatial fluctuations: the graph is regular 
and the coupling are all equal (an important example is the coloring of random regular graphs, that as we already 
said corresponds to the Potts antifcrromagnetic model). 

In that case, it is natural to assume that the distribution does not depend on the particular site. Even if 

in a given glass state a the local fields are different, once we take the average over the states, all the sites must 
be statistically equivalent in the thermodynamic limit. Then we have Qi_>. j (r^j ) — > Q c &v(Vc&v) for N — » oo, and 
Eq. (226) becomes: 

Qcavfacav] = [ ^Qfocav] ' ' ' d Q[Vcw] 6 [Vce,y ~ F[vL v , ' ' ' , ^cav]] Kavfrc; 

^cav J 

= F{Q[ ?7c 1 av ]...g[r, c c av 1 ]} , 

where the function T is defined in (198). 

Now we can compute the free energy. This is done by noting that in Eq. (228) all the terms are equal if Qi^j —¥ Q. 
We get 

z« = J dQlnUdQlnU {z { %L v ,vL v ]} m , 

z[l) = ^cavO^cavM^CTl,^) . 

and similarly 

Z^ = JdQ[ v U--dQ[v^] {^[icaV-.Cv 1 ]} 

with given in (183). The free energy $(m, T) = — \ogZ m is given by 

$(m,T) = -T[ log - | log Z®] . (232) 

The equation above is still variational; differentiating it with respect to Q[?7 cav ] leads to the self-consistency equation 
(229). 

A final remark is in order to conclude this discussion. Eq. (229) is dangerously reminiscent of equation (199), 
that corresponds to the RS case in presence of local fluctuations (indeed, the two equations are formally equivalent at 
m = 0). These two equations should not be confused. There is a deep physical difference between the two distributions 
Q[rj\ and V[rf[: the former describes the fluctuations over the many states for a given sample, while the latter describes 
the fluctuations over the samples of a single pure state. For this reason, in (199) the weight z™ v is absent. The physical 
difference between these equations is better seen if we compare the free energy (201) with the one we derived here, 
Eq. (232). In the former case, the average over V[rf\ represents an average over the disorder and is taken outside the 
logarithm. In the latter case, the average over Q[rf\ is over states for a given sample, and therefore it is taken inside 
the logarithm. 



-liim 

(229) 



(230) 



(231) 
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4- Complications: spatial fluctuations, factor graphs 

Now it should be clear how to introduce spatial fluctuations. Instead of considering a single Q[^ cav ], we must 
introduce a distribution over the sites, V [Q[j? C av]l > defined as the probability that a cavity variable has distribution of 
cavity fields <5[?7 C av] over the states. The equation (229) now depends on the disorder, i.e. Fj depends on the coupling 
and the number c might fluctuate. Similarly to (199) we get 

■{c.J} 



V[Q]= J dP[Q 1 ]"-dP[Q c - 1 ]<5[Q-Fj[Q 1 ,--- ,Q C ~ 1 }] . (233) 

The free energy has now to be computed according to (232) taking an external average over c, J, "P. 

In the case of a factor graph all these considerations are easily generalized along the lines of section III C 7. We 
must introduce a Q C av[f?cav] and a Q test latest] > with recursions 



(234) 



Qcavfacav] = [ <*<3test [vlest] ' ' ' d Qtest [Vtest] ^cav ~ -^cav fotest > ' ' ' : Vtest}] Kav [vlest » 1 " " > Cst]}™ 

^cav J 

= F cav {Q t est [??test] ' ' ' QtcstfatcTst 1 ]} , 

Qtcst latest] = / rfQcav[77cav] ' ' ' dQcaAVcw] %tcst ~ -Ftest [vLv , ' ' ' , ^]] { Ztcst facav > ' ' ' , ^cav*]} 

-^•test J 

= FtestjQcav^cav] ' ' ' Qcav [Vc^] } • 

The free energy $(m, T) is computed as in (210) by replacing z — > Z as in the k = 2 case: 

$(m,T) = AF< S > + £af< c > - cAF^ = -TflogZ^ + C - logZ^ - clog .2^] . (235) 

In presence of external disorder one introduces distributions "P cav [Q[j? C av]] > ^test [Q latest]] j of these fields and equa- 
tions like (233), 



Pcav[Pcav] = J dV test [Q\ est ] ■ ■ ■ dPtest [Q^] ^Qcav - F J, C av [QLt , ' ' ' , Qtcst]] 
V tcst [Ptert] = 1 rfPcav [Qcav] ' ' ' ^cav [Qcav 1 ] 5 [Qtest - F j ttest [Q\ av , ■■■ , Qcav 1 ]] 



{J,c} 
■U} 



(236) 



and the free energy (235) has to be averaged over J, c, and V . 

The explicit solution of the Irsb cavity equations is possible for the case of fc-XORSAT (=> Ex. III. 7). This is 
a very useful exercise that will allow to familiarize with Irsb cavity equations, and to check that we can obtain in 
this way the same result that we already obtained by mean of the leaf removal algorithm and the replica method (=> 
Ex.III.l). 



E. Phase transitions in g-COL 



We wish to conclude the discussion of the cavity method by presenting the spectacular results that have been 
obtained for the g-coloring of random graphs. These results are particularly interesting because, at variance with 
XORSAT, (/-COL has a nontrivial S(s) which seems to be common also to other optimization problems [63, 66, 79]. 
In this case the lRSB equations have to be solved numerically. As we already discussed, g-COL corresponds to an 
antiferromagnetic Potts model. There is no disorder in the coupling and one has only two-body interactions, hence 
there is no need to introduce a factor graph representation. For regular random graphs, there is no local disorder at 
all. At the lRSB level, one can use therefore Eq. (229) and (232) to compute the free energy. For Erdos-Rcnyi graphs, 
there is local disorder due to the fluctuations of the connectivity and one needs to solve Eq. (233). There are several 
tricks that help the numerical resolution of these complicated equations. We do not discuss here the details and we 
refer to the original paper [63] , from which the following discussion is reprinted. 

Consider that we have q > 4 colors (the q — 3 case being a bit particular [63, 66], as we shall see) and a large Erdos- 
Renyi random graph whose average connectivity c we shall increase continuously. Different phases are encountered 
that we will now describe (and enumerate) in order of appearance (the corresponding phase diagram is depicted in 
figure 16). 
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# Cluster without frozen variable 
Cluster with frozen variables 



Colorable phase 



Uncolorable phase 




(iv) 



(v) 



(vi) 



connectivity 



CLUSTERING 



CONDENSATION 



RIGIDITY 



UNCOL 



FIG. 16: (From [63]; replace c c by ck in the figure) Sketch of the space of solutions — colored points in this representation — 
in the g-coloring problem on random graphs when the connectivity c is increased, (i) At low c, all solutions belong to a single 
cluster, (ii) For larger c, other clusters of solutions appear but a giant cluster still contains almost all solutions, (iii) At the 
clustering transition Cd, it splits into an exponentially large number of clusters, (iv) At the condensation transition ck, most 
colorings are found in the few largest of them, (v) The rigidity transition c r (c r < ck and c r > ck are both possible depending 
on q) arises when typical solutions belong to clusters with frozen variables (that are allowed only one color in the cluster) . (vi) 
No proper coloring exists beyond the COL/UNCOL threshold c s . 



(i) A unique cluster exists: For low enough connectivities, all the proper colorings are found in a single cluster, 
where it is easy to "move" from one solution to another. The Irsb equation reduce to the RS ones and the 
cavity fields are uniform. The entropy can be computed and reads in the large graph size N limit 

s to t = ^ = log? + ^log(l-i). (237) 
Jy 2 q 

This corresponds to the region T > Ttap in figure 2. 

(ii) Some (irrelevant) clusters appear: As the connectivity is slightly increased, a Irsb solution appears and 
the phase space of solutions decomposes into a large (exponential) number of different clusters. It is tempting 
to identify that as the clustering transition, but it happens that all (but one) of these clusters contain relatively 
very few solutions — as compare to the whole set — and that almost all proper colorings still belong to one single 
giant cluster. Clearly, this is not a proper clustering phenomenon and in fact, for all practical purpose, there 
is still only one single cluster. Equation (237) still gives the correct entropy at this stage. This corresponds to 
Ttap > T > Td in figure 2. 

(iii) The clustered phase: For larger connectivities, the large single cluster also decomposes into an exponential 
number of smaller ones: this now defines the genuine clustering threshold Cd- Beyond this threshold, a local 
algorithm that tries to move in the space of solutions will remain trapped in a cluster of solutions [72]. Inter- 
estingly, as in the case of the fully connected p-spin model and of fc-XORSAT, it can be shown that the total 
number of solutions is still given by equation (237) in this phase. This is because, as we already discussed 
for the p-spin, the free energy has no singularity at the dynamical transition (which is therefore not a true 
transition, but rather a dynamical or geometrical transition in the space of solutions). This region corresponds 
to T K <T < T d in figure 2. 

(iv) The condensed phase: As the connectivity is further increased, a new sharp phase transition arises at the 
condensation threshold ck where most of the solutions are found in a finite number of clusters (the largest). From 
this point, equation (237) is not valid anymore and becomes just an upper bound. The entropy is non-analytic 
at ck, therefore this is a genuine static phase transition. This correspond to T < Tjc in figure 2. 

(v) The rigid phase: Recall that in XORSAT there was a finite fraction of frozen variables (the backbone) in 
each cluster. Here the situation is different and two different types of clusters exist: in the first type, that we 
shall call the unfrozen ones, all spins can take at least two different colors. In the second type, however, a finite 
fraction of spins are allowed only one color within the cluster and are thus "frozen" into this color. It follows 
that a transition exists, that we call rigidity, when frozen variables appear inside the dominant clusters (those 
that contain most colorings). If one takes a proper coloring at random beyond c r , it will belong to a cluster 
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where a finite fraction of variables is frozen into the same color. Depending on the value of q, this transition 
may arise before or after the condensation transition. 

(vi) The UNCOL phase: Eventually, the connectivity c s is reached beyond which no more solutions exist. The 
ground state energy (as sketched in figure 7) is zero for c < c s and then grows continuously for c > c s . The 
values c s computed within the cavity formalism are in perfect agreement with the rigorous bounds derived using 
probabilistic methods and are widely believed to be exact (although this remains to be rigorously proven) . 

Precise values of all threshold connectivities corresponding to all these transitions are reported in [63] for the regular 
and the Poissonian (i.e. Erdos-Renyi) random graphs ensembles. The peculiarity of 3-coloring is that Cd — ck so that 
the clustered phase is always condensed in this case. 

F. Exercises 

1. XORSAT with replicas: The aim of this (long) exercise is to re-obtain the results on the clustering in 
XORSAT discussed in section III B by means of the replica method discussed in section II B 3. Consider k- 
XORSAT on an Erdos-Rcny graph of mean connectivity c = ak. Recall that this means just that the variables 
entering in each equation are taken independently at random. It is convenient here to use boolean variables, 
X = (xi, ■ ■ ■ ,x N ), and the form of the constraints is x^ + ■ ■ ■ + x ik = bi, with b t a random boolean variable. 
In the following we discuss the problem for general k but one might first try to do the exercise for k = 3 for 
simplicity. 

In the replica method (section II B 3) we wish to compute the entropy of m coupled replicas S(m). The 
partition function of one replica is Z = J\f = J2x ^P0> i- e - the number of solutions. Based on the discussion of 
sections II B 4 and II B 5, we have: 

5(m) - -^fogN^ = -J= lim n d nWm) n = ^ lim n d n 7^ , (238) 

iV iv n— >0 iv n— >0 

where the v — mn replicas are divided in blocks of m coupled replicas. Then we want to compute all the 
moments of N. Note that the computation of the first two moments has already been discussed in section IIIB. 

• Following the same route than for the second moment, show that 



M 

W = E n i ( Xl )---W= E ]p(X\---,X")] M , (239) 

X 1 ,— ,X" a=l X X ,---,X» 

where p(X 1 ,--- ,X V ) is the probability that all configurations X a are solutions of a randomly drawn 
equation. 

• Denote X = (X 1 , • • • , X v ) and x — (x 1 , ■ ■ ■ , x v ); denote by and 1 the vectors of all and 1 respectively. 
Define x + y = (x 1 + y 1 , ■ ■ ■ ,x u + y v ). Show that, for large N, 



1 hN 

£(xi, ■■ ■ ,Xk) = ^ [S(xi H \-x k = 0) + S(xi H h x k = 1) 



(240) 



Introduce the function 



1 N 

P(S\X) = -^{S = Si) , (241) 



i=l 

note that it is normalized to 1 when summed over x, and show that 



p(X)= J2 p{xi\X)--- P {xk\X)£{x lr -- ,x k ) . (242) 



Xi,— ,x k 
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Using the previous results, denote by r(x) a generic normalized function, and show that 



AT ?7 = y Dr{x)M[r{x)] 



^2 r (^i) • • • r{x k )£(x!, • • • ,x k ) 



M 



(243) 



where .M[r(:E)] is the number of replicated configurations X giving rise to the same p(x\X) = r(x). Show 
that the latter is given by the multinomial factor 



M[r(x)} 



Nl 



U S (Nr(x)\) • 

• Take the large N limit with M — aN and deduce that 



(244) 



M u = exp \ N max 

r(x) 



^r(x)logr(x) + alog ^ r{x x ) ■ ■ ■ r{x k )£{x u ■ ■ ■ ,x k ) 



(245) 



Make the following ansatz for r(x): 

i h n r 1 

r(^) = -^- + &II ^(s(^=6) + 5(x k = l)) 



fe=i 



(246) 



where x k — (^i+ m (fc-i) , • • • ,x mk ) is the vector of the replicas in the fc-th block. Compute the overlap 
between two replicas; using spin notations, S = (— l) x , 



Q ab = = $>(a-)(-ir°(-ir 



(247) 



Show that it is equal to 6 if the two replicas are in the same block, and zero otherwise. Based on the 
results of section IIIB, give a justification of this ansatz and interpret b as the fraction of variables in the 
backbone. 



Substitute this ansatz in (245); show that N mn — exp{iVmax{, S(m, n; b)} with 

1-6 



S(m, n; b) 



2" | — 

2" 



<2rnn j \ 2 n 



(2 m "-2")- — ^log ' h 



+ alog( J—(l-b k ) + -^-b k 



and deduce that (for the interesting case m < 1): 
iS(m) = min lim d n S(m, n; b) 

b n— 7-0 

= log(2) min {b + m(l - 6) + 6 fc (m - l)a 



(248) 



(m - 1)(1 - b) log(l - 6)} 



(249) 



• Write the equation for b and check that it docs not depend on m. Using Eq. (128), deduce the expressions 
of s(m) and £(m). Note that S(m) is linear in m, therefore for each value of a, s(m) and £(m) do not 
depend on m. Show that this gives back Eq.(165) and (161). 

2. The Ising ferromagnet: Consider the Ising model on the fixed connectivity random lattice. The cavity 
probability can be parametrized as r] cliV (S) = 1+ " c5 . Show that the recurrence equation (190) in terms of the 
magnetization m c becomes 



m c = tanh[(c — l)atanh[mtanh(/3J)]] 



(250) 



Show that there is a phase transition from a paramagnetic to a ferromagnetic phase; compute the critical 
temperature T c . Express the real magnetization m that enters in P(S) = 1+ ™ s in terms of m c ; compute the 
critical exponent /3 associated to m ~ |T — T c \@ . 
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3. RS solution of g-COL: Consider the g-coloring of fixed connectivity random graphs at zero temperature. 

• Assume that the replica symmetric solution is uniform over all possible colors, 77cav(c) = 1/g; show that 
this is indeed a solution. Compute the entropy s(c, q) = log q + c/2 \og((q — l)/<?); note that it vanishes for 
a given value of c <~ qlogq for large q. 

• Consider solving the cavity equation by iteration, at each step using the right hand side to compute a 
new estimate to the solution until convergence. Study the stability of the uniform solution under this 
process. In other words, consider a small perturbation of the uniform solution, ij cav (a) = l/q + Sr] cav (cr), 
X)cr ^cav(o") = 0. Linearize the cavity equation to obtain a linear equation for 5rj cav {a). Show that the 
perturbation decays exponentially under iteration for c < q while it grow exponentially for c > q. Then 
the uniform solution is unstable for c > q. See [71] for an interpretation of this instability. 

4. A consistency check: Show that for k = 2 the factor graph cavity equations reduce to (190) and the free 
energy reduces to (184). 

5. XORSAT: Consider the 3-XORSAT problem with fixed or fluctuating connectivity, at zero temperature in the 
SAT phase ("entropic" cavity method). Show that the uniform solution rj cav (S) = i] tcst (S) = 1/2 is indeed a 
solution of the iteration equations on a factor graph. Deduce that the zero-temperature entropy is s = (I— a) log 2 
as found using the leaf removal algorithm. 

6. Variational principle: Show that differentiation of (184) with respect to f] cav (a) gives back (190). Keep in 
mind that r] cav (a) must be normalized! Repeat the calculation also in the factor graph case. 

7. Explicit solution of Irsb equations for XORSAT: Consider k- XORSAT on an Erdos-Renyi graph of mean 
connectivity c = ak. Variables are represented by Ising spins and the form of the constraint is ip a (Si, • • • , Sk) = 
S(J a Si ■ ■ ■ Sfc = 1), where J a = ±1 with uniform probability. 

Solve the iteration equation as follows: 

• For a given graph, variables may be in the backbone or not. According to the analysis of section IIIB, this 
depends only on the topological structure of the graph. Then the cavity fields 77 (for both cavity spins and 
cavity tests) can be of three different types: rj(S) = 1/2, if the variable is not in the backbone; rj(S) — Sss 
or T](S) = 5s, -1 if the variable is in the backbone. 

• If the variable is not in the backbone, then for all states a it is free; the distribution 

Q[ V (S)]=S[r ] (S) = 1/2} =A 1/2 . (251) 
If the variable is in the backbone, then it is frozen to ±1 with uniform probability, 

Q[V(S)\ = ls[ V (S) = 6 SA ] + ±5[r,(S) = 5 s ,-i] = A ±1 . (252) 

Show that the latter statement is a consequence of the symmetries of the problem. 

• Assume that the lRSB distribution over the sites of Q[?7 cav ] has the form 

V[Q[vc, v }] = WfQfacav] - A±i] + (1 - b)5[Q[ Vcav ] = A 1/2 ] , (253) 

where b is the probability (over the sites) that a variable is in the backbone. 

• Plug the equation above into the second Eq.(236); show that 

^QW] - ^"^[Qfatest] - A±i] + (1 - ft*" 1 )* [Qfotert] = A 1/2 ] , (254) 

• Now plug the latter expression in the first Eq.(236); show that for a given c one has 

^[Ql^cav]] = (1 - (1 - ^"^-^[Qfrcav] = A ±1 ] + (1 - b^f- 1 5 [Q[r? cav ] - A 1/2 ] . (255) 

Show that the average over I = c — 1 > must be taken using a Poissonian of average ak; for this follow 
the the reasoning before Eq. (200). Take the average over c and show that one gets back 

V[Q{vc, v }] = (1 - f(b))S[Q[r, cav ] = A±i] + f(b)S[Q[r, ca , v ] = A 1/2 ] , (256) 
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with f(b) = e akb> " 1 . Then b must satisfy the equation 

1 - b = f(b) = e- akb "~ 1 , (257) 

that gives back Eq. (151) obtained with the leaf removal. Show that for a < only the solution b = 
exist, and show that it gives back the RS solution. For a > ad, there is a solution b* ^ 0. Note that b* 
does not depend on m. 

Compute the free entropy (128) from Eq.(235) (just drop the — T to get the free entropy, and take the average 
of the different terms over J,c,V. The result is 

=]ogZ® = -(\og2)[b k +m(l-b k )] , 
S {c) = \ogZ^ = -(log2)[6 fe + m(l-6 fe )] , (258) 
5 (s) =logZ (s) = -(log2)[m(ak-l + b-akb k - 1 )+akb k - 1 -b] . 

Then S(m) = — limr-^o (3<&(m, T) is linear in m and one gets a single value of £ and s for all m. This gives 
back Eq.(165) and (161). 

IV. CONCLUSIONS AND PERSPECTIVES 

The main message of these notes was that mean field spin glasses are characterized by the existence of many pure 
states, among which some are stable equilibrium states and others are metastable. We discussed different models 
(spherical p-spin, SK model, optimization problems such as XORSAT and q-COL) that share this feature, and some 
methods (the replica and cavity method, and in the case of XORSAT the rigorous leaf-removal method) to compute 
the complexity and other properties of these states. 

This particular property of spin glasses (that is not shared by all disordered systems) raises many intriguing 
questions. For instance, what is the influence of the presence of these states on the dynamics? This question is also 
relevant for the analysis of search algorithms in optimization, that can be regarded as (non-equilibrium) dynamics for 
the corresponding physical system. 

And what about non mean-field models? Are so many states present also in finite dimensional models? And how 
can they be described? 

These questions did not receive a complete and satisfactory answer at present and are active research topics. Still, 
many important advances have been made. Here it follows a list of references that may be consulted to go deeper 
into these fascinating problems. The list is very incomplete and strongly biased towards mean-field inspired work; it 
is intended only to stimulate the curiosity, and the reader is strongly encouraged to look for further references. The 
articles cited below have also been chosen because they are useful sources of more references on the same subject. 

1. The equilibrium dynamics of the spherical p-spin model can be completely solved in the paramagnetic phase, 
and it can be shown that a dynamical transition takes place at the temperature Td where states first appear. 
The transition is charaterized by the divergence of the relaxation time. These results are reviewed in [12, 13, 46]. 
The same can be proven for the equilibrium dynamics of XORSAT, sec [72, 81]. 

2. In optimization problems one wants to find the ground state of the system. The simplest way to do that is to 
consider a dynamics satisfying detailed balance at temperature Tj, and then reduce the temperature down to 
T = (or a low temperature T — Tf) at a given rate 7 (classical annealing). In the limit 7 — > 00 one just 
istantaneously quenches the system from Tj to Tf. For the spherical p-spin one can show that if Tf < Td, the 
system falls out of equilibrium and starts to age; its energy decreases but approaches the energy of the threshold 
states. Therefore classical annealing cannot find the ground state for this system, as it is always trapped by 
higher energy metastable states. The aging dynamics of this and many other models is reviewed in [13]; for the 
specific case of the SK model see [82] . In some cases it is not at all obvious to understand which states dominate 
the aging dynamics, see [83] and in particular [84] for a very detailed discussion of this point using the cavity 
method. 

3. More generally, one can consider dynamics that do not satisfy detailed balance. In the case of optimization 
problems, many algorithms designed to search for solutions falls in this class. These algorithms are known to 
undergo algorithmic transitions: the probability (over formulas and randomness built in the algorithm) to find a 
solution decreases abruptly from 1 to (N — > 00) when the density of constraints is increased over a value a a . Is 
a a related to some property of the equilibrium states (their existence, the presence of frozen variables, • • • )? This 
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is mainly an open problem. See [85] for a review of algorithms that have been studied with methods borrowed 
from physics, and [86] for an original perspective on the general connection between (free)energy landscapes and 
algorithms. 

4. The Irsb spin glass transition has been conjectured to describe the glass transition in finite-dimensional particle 
systems. This is based on the following observations: 

• The equations that describe the equilibrium dynamics of the spherical p-spin closely resemble the Mode- 
Coupling equations that describe liquids close to the glass transition. Sec e.g. [13]. 

• The existence of an exponential number of states between Td and Tk at the mean field level is impossible 
in finite-dimensional systems. Therefore one has to re-discuss the definition of these states. This leads 
to very important ideas, such as entropic- driven nucleation, that might explain the dynamics of liquids at 
temperatures below the Mode-Coupling regime. A pedagogical discussion of the definition of states in finite 
dimension can be found in [7] and [87], as well as in Appendix A of [88]. A very successful theory of glasses 
based on the adaptation of the mean-field scenario has been developed by Wolynes and goes under the 
name of Random First Order Theory; a review is [89], see also [18, 19]. Quantitative replica calculations 
of the thermodynamics are reviewed in [55, 88]. 

• A detailed investigation of the dynamics of liquids close to the glass transition revealed the existence of 
dynamical heterogeneities, namely of regions of the sample that are more mobile than others. This defines 
a dynamical correlation length, related to the typical size of the heterogeneities, that seems to diverge at 
the glass transition. Different theories account for heterogeneity; in particular mean-field like equations 
predict the existence of such a correlation length and its divergence at Td [90-92] . 

• The aging dynamics of glasses is very similar to the mean-field one [12, 13]. 

• This mean-field-like scenario has been derived also for Kac versions of lRSB spin glasses in finite dimension; 
see in particular [93-95]. 

It is important to keep in mind that, as the transition is first-order in mean-field, its understanding in finite 
dimension is mostly related to nucleation phenomena. 

5. On the contrary, the fRSB transition of the SK model is a true second-order critical point. A natural question 
is whether finite-dimensional spin glass models on cubic lattices with nearest-neighbor two body interactions, 
like the Edward-Anderson model, also undergo a fRSB spin glass transition. This is very much debated; a 
classical alternative picture is presented in [96]. Recent reviews of the status of the mean- field approach are 
in [6, 97, 98]. In this case one would like to apply to the problem the whole machinery of standard second 
order phase transition, like scaling, renormalization group, computation of upper/lower critical dimensions, etc. 
Unfortunately, this is very difficult, see e.g. [50], and for the moment most of the results come from numerical 
simulations. 

6. It is very important to understand how glassy systems respond to external drives. For instance, one can shear 
a liquid close to the glass transition, or consider a granular driven by an external tapping, etc. This situations 
are often met in experiments and in many practical applications. A seminal paper in this respect is [99], where 
it was shown that the structure of states of the spherical p-spin model gives rise to a complex behavior of the 
system when subject to an external drive. This led to the prediction of a "complex rheology" in liquids close 
to the glass transition and colloidal systems [100] that is able to explain most of the phenomenology of these 
materials subject to external drives. 

7. Finally, the role of quantum fluctuations for the glass transition has to be elucidated. Is it possible to have 
many pure states in a quantum system? Quantum versions of the spherical p-spin model have been solved 
in [101] using the replica method, and quantum TAP equations have been discussed in [102]. It was shown 
that the spin glass transition becomes first order at low temperature (and down to T — 0) as a function of 
the quantum fluctuations parameter (e.g. a transverse field). However, much less is known for non-mean field 
models; already on random graphs, the development of a quantum version of the cavity method is a very recent 
achievement [103, 104]. 
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